Programming 101s : Boolean Operations and Other Dangers

sibomots · February 18, 2018, 6:12pm

I’ve been following the threads here and IRC channel for quite some time and I’ve seen some really interesting questions fly by. Those are good. I like seeing complex programming questions. But then once in a while I see a series of questions fly across the screen that make me shake my head. I’m eager to help and so this post is hopefully not the first in the series of 101s. Let’s look at boolean logic:

Boolean logic, or Boolean algebra is about expressions where the constituent parts resolve to truth values of TRUE or FALSE. It’s best to leave alone the detail of how a computer represents a TRUE or FALSE value. Not every system assumes TRUE is non-zero or that FALSE is zero. That’s platform dependent.

If your language (Java in this case) provides a native truth value for FALSE or TRUE, then use it. It’s no accident that Java does so.

Boolean algebra terms

predicate – A predicate is the left side of a Boolean expression where we think of the expression as a function. Given some function f(x) we would call this a predicate f on x. The result or value of the predicate is either true or false.

axiom – A true statement.

theory – A set of all statements that can be constructed from the axioms by applying inference rules.

proposition – A statement that contains no variables. They are always true or false.

Example

Axioms: “All fish are green”. “All sharks are fish”
Theory: “All sharks are green”

We can find ourselves in trouble if we are not consistent with our axioms:

Axioms: “All fish are green.” “Not all sharks are green”
Theory: “Not all sharks are fish”

Propositional statements:

1 + 1 = 2 (always true)
2 + 1 = 4 (always false)

Non-Propositional statements:

x + 3 = 1 (may be true or not, it depends on the value of x)

x * 0 = 0 (always true, but not a proposition because of the variable)
x * 0 = 1 (always false, but not a proposition because of the variable)

Mathematical Logic Operators

The operator not is performed with a bang !.

if (! expression)

If expression resolves to a true value then the not of that expression is false.
Likewise if the expression resolves to a false value then the not of the expression is true

Let’s suppose your expression is a function:

if ( ! func() )

Let’s suppose that func() does something which allocates a resource. (performs an operation that modifies a value). The effect can be a problem for your code:

Did you really intend for func() to allocate those resources (or modify those values) no matter which truth value it returns? Likely this is unwanted. It’s a bug.

It is better to not use function calls inside of conditional statements because of the side-effects that can occur. The side-effect is that func() can allocate resources and those resources will be persisted until the resources go out of scope. Perhaps they will never go out of scope until the program is finished.
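A minimal sketch of that side-effect, assuming a hypothetical func() that bumps a counter every time it runs (the class name, the counter, and the helper are made up for illustration):

```java
class SideEffectDemo {
    // Hypothetical stand-in for "a resource": counts how often func() ran.
    static int allocations = 0;

    // Hypothetical func(): modifies state first, then reports failure.
    static boolean func() {
        allocations++;
        return false;
    }

    // Evaluates the condition once and returns how many allocations happened.
    static int runOnce() {
        allocations = 0;
        if (!func()) {
            // we only wanted to react to the failure here...
        }
        // ...but the allocation happened anyway, whatever func() returned
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println("allocations after the if: " + runOnce()); // prints 1
    }
}
```

The counter changes even though the if statement only looks like a test.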

This is an aside, but it is wise to use functions that do not modify the object if they are used inside a conditional statement. Some languages provide a syntax to declare a function as read-only, i.e., a function that does not modify the object (C++ has const member functions; Java has no direct equivalent). Where it is available, using a const (read-only) function is preferred inside a conditional.

Let’s go further with the not operator:

if ( ! expression )
{
    // do something if expression is `false`
}
else
{
    // do something if expression is `true`
}

This is preferable to the following, which is wrong:

if (! expression )
{
     // do something if expression is `false`
}
else if ( expression )
{
     // do something if expression is `true`
}
else
{
    // this block of code will NEVER get reached.  The compiler
    // will optimize this block OUT of the code, but the literal code will be difficult
    // to understand at first and confuse readers. It'll confuse you too after a while.
}

Compilers are usually good enough to optimize a cascade of conditions where multiple expressions may be true for flow of control to go into blocks. Consider:

if ( expression )
{ 
     // do something if expression is true
}
else if ( other_expression )
{
    // do something if expression is false AND other_expression is true
}
else if ( yet_another_expression )
{
    // do something if expression is false AND other_expression is false AND yet_another_expression is true
}
else
{
     // do something if expression is false AND other_expression is false AND yet_another_expression is false
}

That kind of code is prone to bugs. All one has to do is re-order the if conditions and the flow may be different when it runs next time after the change.

If other_expression is true AND expression is true the flow depends on which one is tested first. Sometimes this is warranted. Most of the time it is not. It’s a poor design.

expression and other_expression may be related or they may be unrelated. Couple that mistake with using functions inside the conditional statement and you run into problems with the resources each of those expressions may allocate, which depend further on the outcome of previously tested (and executed) functions.

The fix is to think of the expressions as axioms. They are either always true or always false. Then describe the truth table of each expression as follows: (The term (#) refers to the block in order from the cascade of if statements)

Expression (1)	Other Expression (2)	Yet Another Expression (3)	Result
F	F	F	4
F	F	T	3
F	T	F	2
F	T	T	2
T	F	F	1
T	F	T	1
T	T	F	1
T	T	T	1

There are four conditions where the 2nd or 3rd expression is true yet the block for that expression is not executed.

Is that intended?

Check your program and make sure that the order is important. Try to make it so the order is NOT important.
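The table above can be reproduced with a short Java sketch. The method name block() and the class are assumptions for illustration; the cascade itself is the one shown earlier, with each block reduced to returning its number:

```java
class CascadeTable {
    // Returns the number (1-4) of the block the cascade would enter.
    static int block(boolean expression, boolean otherExpression, boolean yetAnotherExpression) {
        if (expression) {
            return 1;
        } else if (otherExpression) {
            return 2;
        } else if (yetAnotherExpression) {
            return 3;
        } else {
            return 4;
        }
    }

    public static void main(String[] args) {
        boolean[] values = { false, true };
        // Walk all eight combinations in the same order as the table.
        for (boolean a : values) {
            for (boolean b : values) {
                for (boolean c : values) {
                    System.out.println(a + "\t" + b + "\t" + c + "\t" + block(a, b, c));
                }
            }
        }
    }
}
```

Swap two of the branches in block() and run it again to see which rows change.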

AND, OR

As we read left to right, these expressions have a meaning:

if ( x && y )

It means if x is true and if y is true, then the whole condition is true.
If x and y are axioms then it won’t matter which order they are put into the code.

Suppose x and y are not axioms. Suppose they are functions (at least one is a function)

if ( func() && y )

or

if ( y && func() )

Take the first example:

if ( func() && y)

You may have assumed that if the result of func() is false, then the whole expression inside the if statement is forced to be false and therefore the code will NOT test y. That is not a good assumption to make. In many languages, maybe in Java, but it doesn’t matter if Java does – there is no guarantee that the order of expressions tested is the same as the order they appear in the if statement. Think of the disaster from the second example:

if ( y && func() )

Perhaps func() modifies a value, but you expected it to only be properly modified if y is true. If the order is not guaranteed then func() may be called before y is tested. This is another bug.

Slightly better:
(Assume that y is not a boolean variable, but an expression that resolves to true or false)

boolean b_y = y;
boolean b_func = false;
if ( b_y )
{
    b_func = func();
}

if ( b_y && b_func )
{
    // do something
}

OR

boolean b_func = func();
boolean b_y = false;
if ( b_func )
{
     b_y = y;
}
if ( b_y && b_func)
{
    // do something
}

Why is this better? First you control the order before the if statement. You are not calling func() until you should. Then you test the result as axioms.

The compiler may even optimize it by noticing that the second test of b_y occurs and so it re-orders the code to test b_func if b_y was true.

The lesson here is: Do not depend on the order of execution inside an if statement. Code that does is defective code and prone to errors and bugs. A simple innocent “refactor” task can make the code seriously flawed.

The exact same statements can be made about the logical OR (||) operator.

Well Known Reductions

De Morgan’s Laws

All computer science students take a course in discrete mathematics. The mathematics of logic, set theory, and proofs.

Here’s one of the first reductions we learn:

! ( A || B ) is the same as (! A) && (! B)

The second:

! ( A && B ) is the same as (! A) || (! B)
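Since A and B can each only be true or false, both reductions can be checked by brute force. A small sketch (class and method names are made up for illustration):

```java
class DeMorganCheck {
    // !(A || B) is the same as (!A) && (!B)
    static boolean firstLawHolds(boolean a, boolean b) {
        return (!(a || b)) == ((!a) && (!b));
    }

    // !(A && B) is the same as (!A) || (!B)
    static boolean secondLawHolds(boolean a, boolean b) {
        return (!(a && b)) == ((!a) || (!b));
    }

    // Tries all four combinations of A and B against both laws.
    static boolean holdsForAll() {
        boolean[] values = { false, true };
        for (boolean a : values) {
            for (boolean b : values) {
                if (!firstLawHolds(a, b) || !secondLawHolds(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("De Morgan's laws hold for all inputs: " + holdsForAll());
    }
}
```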

Last Notes

In Java it is not possible to make this side-effect because the ! operator works on boolean values. But in other languages, like C, or C# or C++ this would be a legal statement:

int x = 5;
int y = !! x;

What is the value of y ?

It’s 1

Why?

! 5 = 0
! 0 = 1

In languages like C, C++, C# this can be a useful way to generate a quick and dirty “truth” value from a non boolean value. But in Java it’s simply not permitted.

`int x = 5;`
`int y = !!x;`

This should generate a compiler error, for good reason.
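If you do need a truth value from an int in Java, the usual approach is to spell out the comparison. A minimal sketch (the helper name isNonZero is an assumption for illustration):

```java
class TruthValueDemo {
    // Java has no implicit int-to-boolean conversion, so compare explicitly.
    static boolean isNonZero(int x) {
        return x != 0;
    }

    public static void main(String[] args) {
        int x = 5;
        boolean y = isNonZero(x);
        System.out.println(y); // prints true
    }
}
```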

More last notes

Warnings are generated by compilers. Sometimes we can flag the compiler to suppress the warnings. Sometimes we can flag the compiler to warn us for anything that is minor. Sometimes we can flag the compiler to stop compiling or treat warnings like errors.

My advice:

Flag your compiler to treat all warnings like errors.

Warnings indicate things that might be a problem. If the compiler thinks it might be a problem, it might be a problem! You don’t want warnings to get compiled into your final code. Before you ship your code, re-build it with static analysis and warnings-as-errors turned on. Clean up all the warnings.

If you found this post helpful, post as much. If you want more posts like this, let me know. I may turn it into a video. If more than 5 people post as much, I will.

RandomByte · February 18, 2018, 6:22pm

Nice guide!

But I disagree with this:

https://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.23
and
https://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.24

I always rely on the order of if conditions.

sibomots · February 18, 2018, 6:26pm

I believe you are correct about the Java specification. It’s unfortunate that the Java specification says that.

It leads to a bad habit to write code in Java and then transfer the habit into a language where the order has no guarantee. The habit leads to problems later in your career.

If you always rely on the order of expressions in an “if” statement then your code will never work as intended on a variety of platforms and languages.

Break that habit.

Seriously, if you program for a living, do not think the order is something to ignore. In C/C++ there is no guarantee that the order is how you wrote it in code. I’m pretty sure the same is true of C#, as it is of MANY other languages you might write code in.

It’s not a debate point, just don’t do it. Your employer and fellow developers will thank you.

RandomByte · February 18, 2018, 7:17pm

Ok, right, thanks for the clarification.
I only use Java intensively.

pie_flavor · February 19, 2018, 12:22am

Actually, it’s guaranteed in C# and guaranteed in C/C++ as well. In fact it’s guaranteed in (AFAIK) every programming language which uses the && operator, because that is the entire point of the && operator. It’s short-circuiting - it evaluates the first argument, and if it can resolve the expression immediately, it does so. That’s what I’d expect to see in boolean statements 101. The operator where the order should not matter is the & operator.

I’m sorry, but you’re just directly wrong here.
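A small Java sketch of the difference between && and & on booleans, using a hypothetical func() that counts its own calls (all names here are made up for illustration):

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Hypothetical func(): records that it was called.
    static boolean func() {
        calls++;
        return true;
    }

    // How many times func() runs when the condition is written with &&.
    static int callsWithLogicalAnd(boolean y) {
        calls = 0;
        if (y && func()) {
            // do something
        }
        return calls;
    }

    // How many times func() runs when the condition is written with &.
    static int callsWithNonShortCircuitAnd(boolean y) {
        calls = 0;
        if (y & func()) {
            // do something
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("&& with y = false: " + callsWithLogicalAnd(false));        // prints 0
        System.out.println("&  with y = false: " + callsWithNonShortCircuitAnd(false)); // prints 1
    }
}
```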

mcmonkey · February 19, 2018, 12:54am

Short-circuit evaluation is a tightly specified feature of almost every language of this styling. If it fails to work as expected, a TON of major-scale projects will cease functioning as they all rely on it being stable.
While I agree it’s helpful to be clear in cases where it’s very important and needs to be rapidly understood, in most coding situations it is better practice to trust that short-circuit evaluation will always be used.

C, C++, and C# definitely all use short circuit evaluation.
You might want to carefully review what you’re claiming in a “101 tutorial”, as there’s visible errors, especially in unnecessary side notes. You claim that C, C++, and C# can do “!!x” on an integer. I believe C and C++ can do that fine (by technicality of booleans just being handled as integers in C anyway) but in C# that won’t work. C# has strict typing including a dedicated “bool” type separate from “int”. An integer can’t be negated to form a boolean, and a boolean doesn’t implicitly convert to an integer.
(Though, of note, in C# you can add your own custom implicit type conversion logic to make things like that valid and possible.)

EDIT: Opening paragraph, is “I’m eager to help and so this post is hopefully not the first in the series of 101s” correct? I think that may be written wrong by accident.

Oh also to be clear: I agree that you should generally avoid performing modifications inside an if (…) area whenever possible. Just by the general logic of separating read and write.

ryantheleach · February 19, 2018, 1:02am

This is true IMO regardless of the specifics of whether some languages leave the order implementation defined.

Reading code becomes much more difficult as soon as order of operations is important, and within a branching statement especially. I’ve seen several bugs in Sponge itself caused by people not understanding relatively simple if statements with ordering, because the compiler inlines things, and the decompiler removes unneeded parentheses that explicitly state the ordering.

Always prefer over-specifying the ordering, even if you do end up using short circuiting side effectual code.

Always prefer pulling out the expression into multiple named variables, the compiler will likely inline them anyway (assuming you don’t change the order of side effects).

Katrix · February 19, 2018, 6:08am

Oftentimes when you rely on short circuiting, though, you can’t pull out the expressions into variables as they should only be evaluated after the first test has passed. null is a typical form of this. Empty list checks are another one. When you use it, you check if it fulfils the precondition, and then you do the real test. As others have said, the whole point of || and && is short circuiting. We have other operators to use when we don’t want short circuiting.
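A minimal sketch of that guard pattern in Java, assuming a hypothetical helper firstItemIs() that must not touch the list until the null and empty checks have passed:

```java
import java.util.List;
import java.util.Objects;

class GuardDemo {
    // Each test to the right only runs if everything to its left was true,
    // so items.get(0) is never reached for a null or empty list.
    static boolean firstItemIs(List<String> items, String wanted) {
        return items != null
                && !items.isEmpty()
                && Objects.equals(items.get(0), wanted);
    }

    public static void main(String[] args) {
        System.out.println(firstItemIs(null, "a")); // prints false, no NullPointerException
    }
}
```

Pulling items.get(0) out into a variable before the if would throw on exactly the inputs the guard exists for.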

Valandur · February 19, 2018, 9:59am

I certainly think we need more tutorials on Sponge

I do however have to agree with other comments here - ordering within conditional statements is relied on in various programming languages. Whether one should use them is another debate.

A further note is that axioms are not necessarily true statements. They are statements that are assumed to be true, and work as the basis of further discussion.

And as @mcmonkey already mentioned, avoiding functions with side effects is generally a good idea, but that doesn’t mean functions in if statements have to be avoided altogether - java generally uses getters, which are functions but should be side effect free.

sibomots · February 19, 2018, 5:12pm

It seems my information was out of date. I definitely was incorrect about the statement that order of operations in an if-statement condition was vendor specific. That appears to be a false (pun intended) statement.

My apologies.

In practice however, the code that I deal with on a daily basis would be and is always written in such a way that the order of operations is never even an issue because of the danger of having a function called in the if statement expression causing a side-effect (modification of data elsewhere). So to avoid that risk, we choose to keep the expressions in an if-statement as clean as possible as I’ve described originally.

I made a typo that an axiom was always true. I meant that it is always true or always false. In other words, once it’s true, it is always true, or once it’s false, it’s always false.

1+ 1 = 2 (always true)
2 + 1 = 4 (always false)

That’s what I meant to write, but I had not proof-read my post as clearly as I should.

The rest of it, I guess one can debate. I choose to write code the way I documented because very often I’m not going to be the one to maintain it. Also, I sometimes inherit code from other developers to fix bugs in, and they are likewise expected to write code in clear methods that have zero ambiguity.

Java is an excellent language to teach modular/OO programming. I see it being used much more often in introduction CS courses. Back in my day taking CS classes, the language simply didn’t exist and the most OO language we had that was main stream was C++. The standards for C++ have evolved much over the last 30 years.

I still stand by my statement that

if (func() && y)

is dangerous because of the short-circuit and side-effect of an innocent but fatal re-factoring to:

if (y && func())

Caveat reader
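A rough sketch of that refactoring hazard, assuming a hypothetical func() whose side-effect is counted (the names are made up for illustration). With y false, the two orderings give the same truth value but not the same side-effects:

```java
class RefactorDemo {
    static int calls = 0;

    // Hypothetical func(): has a side-effect we can observe.
    static boolean func() {
        calls++;
        return true;
    }

    // Original ordering: func() always runs.
    static int callsFuncFirst(boolean y) {
        calls = 0;
        if (func() && y) {
            // do something
        }
        return calls;
    }

    // "Innocent" refactor: func() is skipped whenever y is false.
    static int callsYFirst(boolean y) {
        calls = 0;
        if (y && func()) {
            // do something
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("func() && y, y = false: " + callsFuncFirst(false)); // prints 1
        System.out.println("y && func(), y = false: " + callsYFirst(false));    // prints 0
    }
}
```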

Thanks for the comments. I’ll do another topic if there is interest.

sibomots · February 19, 2018, 5:15pm

I couldn’t remember that C# prevented that. It makes sense. It’s strict on typing as you stated. My mistake. I seldom do C#, but have had to maintain some code in C#. Upon reflection it seems that C# shares a large quantity of themes of Java to some extent if you will.

Thanks for the comments.

sibomots · February 19, 2018, 5:19pm

As others have stated, I was incorrect because of the rules of && and || and likewise operators.

Barring those, it could become an issue. I’ll try to produce some assembly output from different functions to show (and prove to myself) that this could be an issue with other operators.

Further this isn’t an issue with Java because I don’t believe you can overload the && and || operators, but in C++ you can. If they are overloaded incorrectly, then all bets are off.