Logic, Flow Control, and Functions in R

R supports procedural programming, so it has the full set of flow control syntax found in many other languages. Indeed, the flow control syntax in R is similar to that of Java and C. In this post, you will see some examples of using the flow control syntax in R.

Let’s get started.


Overview

This post is in three parts; they are:

  • Finding Primes
  • The Sieve of Eratosthenes
  • Sum of the Most Consecutive Primes

Finding Primes

Let’s start with a simple problem: Find the list of all primes below a certain number N.

The first prime is 2. Any integer larger than 2 is a prime if it is not divisible by any prime less than it. This simple definition can be converted into an R program as follows:
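A minimal sketch of such a program is shown here. The upper limit of 100 is an assumption chosen for illustration, and the vector name primes is likewise a choice made for this sketch; pmax, i, j, and isPrime follow the description in this post.

```r
# Find all primes up to an upper limit by trial division
pmax <- 100      # upper limit (inclusive); an assumed value for illustration
primes <- c()    # primes found so far (starts empty)

for (i in 2:pmax) {
    isPrime <- TRUE
    # Try every prime found so far as a divisor of i
    for (j in primes) {
        if (i %% j == 0) {
            isPrime <- FALSE
            break
        }
    }
    # Append i to the vector only if no earlier prime divides it
    if (isPrime) {
        primes <- c(primes, i)
    }
}
print(primes)
```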

If it runs successfully, the program prints the vector of primes it found.

The algorithm in the above code is as follows: You scan from 2 to pmax (both ends included), and for each number i, you use another for-loop to check whether any existing prime j divides it. If i %% j == 0, you know that i is not a prime. Hence you mark isPrime as FALSE and stop checking.

Each prime found is appended to the vector of primes at the end of its iteration. When the program ends, this vector holds all the primes up to the upper limit.

From the above, you see some basic R language features. Conditional branching in R has the syntax:
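As a minimal sketch of an if-else branch, the example below classifies a number as even or odd. The variable names x and parity are hypothetical and used only for illustration.

```r
x <- 7   # hypothetical value to test

# The condition goes in parentheses; each branch is a block in braces
if (x %% 2 == 0) {
    parity <- "even"
} else {
    parity <- "odd"
}
print(parity)
```

Note that when you type this at the top level of a script or the console, else should sit on the same line as the closing brace of the if block. Otherwise R treats the if statement as already complete.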

This syntax resembles that of JavaScript, including the fact that the semicolons marking the end of each statement are optional.

The conditions are supposed to be Boolean. Hence we can use the logical variable isPrime above, or a comparison expression such as i %% j == 0. The operator %% is the modulo operator, which gives the remainder of a division. R documents its common operators and their precedence in a table.

You can find this table in R using the help statement “?Syntax” with uppercase S in “Syntax”.
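A few operators and their relative precedence can be checked directly. This is only a small sketch; the variable names r1 to r5 are hypothetical.

```r
r1 <- 7 %% 3      # remainder of 7 divided by 3
r2 <- 7 %/% 3     # integer quotient of 7 divided by 3
r3 <- -2^2        # ^ binds tighter than unary minus, so this is -(2^2)
r4 <- 1:3 * 2     # : binds tighter than *, so this is (1:3) * 2
r5 <- 2 + 3 * 4   # * binds tighter than +

print(c(r1, r2, r3, r5))
print(r4)
```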

In C and Java, you may recall there’s a ternary operator “condition?value_true:value_false”. This is an operator because its use is limited to returning a value (either value_true or value_false, based on the truth value of the condition), rather than executing a large chunk of code. R provides something similar as a function, ifelse(condition, value_true, value_false).

But you should not confuse this function with the if-else statement.
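The difference can be sketched as follows, assuming the function in question is the built-in ifelse(). The vector x and the labels are hypothetical.

```r
x <- c(3, 8, 5, 10)   # hypothetical values

# ifelse() is a function: it tests every element and returns a vector
labels <- ifelse(x > 4, "big", "small")
print(labels)

# if-else is a statement: its condition is a single TRUE or FALSE,
# and each branch can run an arbitrary block of code
if (x[1] > 4) {
    first <- "big"
} else {
    first <- "small"
}
print(first)
```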

Furthermore, you can chain several conditions with else if, in a syntax similar to that of C or Java.
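A minimal sketch of a chained if statement is below. The score thresholds and the variable names are hypothetical.

```r
score <- 85   # hypothetical input

# Conditions are tested from top to bottom; the first TRUE branch runs
if (score >= 90) {
    grade <- "A"
} else if (score >= 80) {
    grade <- "B"
} else {
    grade <- "C"
}
print(grade)
```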

However, R has no switch statement. Instead, switch() is a function that takes the value to match as its first argument, followed by the alternatives to choose from.
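Here is a small sketch of switch() matching a string against named alternatives. The names and values are hypothetical; the unnamed last argument acts as the default when nothing matches.

```r
food <- "carrot"   # hypothetical input

kind <- switch(food,
    "apple"  = "a fruit",
    "carrot" = "a vegetable",
    "unknown"              # unnamed last argument: the default
)
print(kind)

# A value with no matching name falls through to the default
other <- switch("stone",
    "apple"  = "a fruit",
    "carrot" = "a vegetable",
    "unknown"
)
print(other)
```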

You also saw in the previous example how a for-loop is created in R: You provide a vector, and the loop scans its elements one by one. The for-loop is not required to iterate over integers; the code above is just one example.
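For instance, a for-loop can scan a vector of strings just as well. The vector words below is hypothetical.

```r
words <- c("logic", "flow", "functions")   # hypothetical character vector
sizes <- c()

# The loop variable takes each element of the vector in turn
for (w in words) {
    sizes <- c(sizes, nchar(w))
}
print(sizes)
```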

When you’re in a loop, you can always terminate it early using the break statement, or skip ahead to the next iteration using the next statement. The following section shows another example.
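A minimal sketch of both statements in one loop is below. It collects the squares of odd numbers and stops once a square exceeds 50; the limit and the names are hypothetical.

```r
odd_squares <- c()
for (n in 1:100) {
    if (n %% 2 == 0) {
        next     # skip even numbers and move on to the next n
    }
    if (n * n > 50) {
        break    # leave the loop entirely
    }
    odd_squares <- c(odd_squares, n * n)
}
print(odd_squares)
```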

The Sieve of Eratosthenes

The previous example of finding primes is slow if you set the limit to a higher value (e.g., one million). A faster algorithm is the Sieve of Eratosthenes, at the expense of slightly more memory. The idea is to find one prime at a time and, once a prime is found, exclude all its multiples from the list of prime candidates.

The implementation of the Sieve of Eratosthenes in R is as follows:
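A minimal sketch of the sieve, consistent with the description in this section, could look like the following. The upper limit of 100 is an assumption chosen for illustration.

```r
# Find primes using the Sieve of Eratosthenes
pmax <- 100                     # upper limit; an assumed value for illustration
primality <- rep(TRUE, pmax)    # primality[i] is TRUE while i is still a candidate
primality[1] <- FALSE           # 1 is not considered prime

for (i in 1:pmax) {
    if (!primality[i]) {
        next                    # i was already excluded; try the next number
    }
    if (i * i > pmax) {
        break                   # no multiples left to exclude below the limit
    }
    # i is prime: exclude its multiples, starting from i*i
    for (j in seq(i * i, pmax, by = i)) {
        primality[j] <- FALSE
    }
}

# The indices that are still TRUE are the primes
primes <- which(primality)
print(primes)
```

Starting the exclusion at i*i is enough because any smaller multiple of i has a smaller prime factor and was already excluded in an earlier iteration.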

This code should produce the same output as the previous one.

In the code above, you see how the next and break statements control the flow inside a for-loop. You can also see how to use the rep() function to create a vector of identical values (TRUE) and the seq() function to create a vector of uniformly spaced values from i*i to pmax.

At the end of the code, the which() function finds the indices where the vector’s value is TRUE. In R, vector indices start at 1. Hence the vector primality has its first element set to FALSE (since 1 is not considered prime) before the for-loop starts.
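These three functions can be tried in isolation. The sketch below uses a hypothetical vector of length 10.

```r
flags <- rep(TRUE, 10)             # ten copies of TRUE
multiples <- seq(4, 10, by = 2)    # 4, 6, 8, 10
flags[multiples] <- FALSE          # a vector can be used as an index
flags[1] <- FALSE                  # indices start at 1, not 0

remaining <- which(flags)          # positions that are still TRUE
print(remaining)
```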

There are a lot of built-in functions in R. The code above shows you a few and you can learn some of the most common functions from the “R Reference Card”.

Sum of the Most Consecutive Primes

Writing a program as above is useful for many projects, but when you run into a larger problem, you may want to structure your program into functional blocks. R not only provides built-in functions but also lets you define your own.

Let’s consider a slightly larger program. This is Problem 50 from Project Euler. You want to find the prime below one million that can be written as the sum of the most consecutive primes. For example, the sum of the first 6 primes is 2+3+5+7+11+13=41, and 41 is a prime. The solution is 997651, which is the sum of 543 consecutive primes.

As you have a way to generate primes up to a million, you can scan the vector of primes and accumulate the sum, then check whether the sum is a prime as well, for as long as the sum stays below one million. At the same time, you need to keep track of the longest sum that fits the criteria.

The following is how you can solve this problem in R:
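One possible sketch of a solution is below. The function name getprimes and the variable pmax come from this post; the names count_max, ans, total, and count are assumptions made for this sketch.

```r
# Return a vector of all primes up to pmax using the Sieve of Eratosthenes
getprimes <- function(pmax) {
    primality <- rep(TRUE, pmax)
    primality[1] <- FALSE
    for (i in 1:pmax) {
        if (!primality[i]) {
            next
        }
        if (i * i > pmax) {
            break
        }
        primality[seq(i * i, pmax, by = i)] <- FALSE
    }
    return(which(primality))
}

pmax <- 1000000
primes <- getprimes(pmax)
count_max <- 0   # number of terms in the longest qualifying sum so far
ans <- -1        # the prime that this longest sum adds up to

for (i in 1:(length(primes) - 1)) {
    total <- primes[i]
    count <- 1
    for (j in (i + 1):length(primes)) {
        total <- total + primes[j]
        count <- count + 1
        if (total > pmax) {
            break
        }
        # Check primality only when this run would beat the current record
        if (count > count_max && total %in% primes) {
            ans <- total
            count_max <- count
        }
    }
}
print(ans)
print(count_max)
```

Testing count > count_max first keeps the comparatively expensive %in% lookup from running on sums that could not improve the answer.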

You can see that a custom function is built to return the vector of all primes. A function is defined using the function() syntax and passes back its result with return(). When you call the function, as in primes <- getprimes(pmax), whatever is passed back by return() is assigned to the variable.
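The same pattern in a much smaller function looks like this. The function sum_of_squares is hypothetical and only illustrates the definition and the call.

```r
# Define a function that takes a numeric vector and returns one number
sum_of_squares <- function(v) {
    total <- 0
    for (x in v) {
        total <- total + x * x
    }
    return(total)
}

# Call the function and assign the returned value to a variable
result <- sum_of_squares(c(1, 2, 3))
print(result)
```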

The rest of the code above should be familiar to you: It is built with for-loops and if statements. You should also see how the answer is recorded and updated in the loop.

One subtle issue deserves your attention: The for-loop on i runs only up to length(primes)-1, while the for-loop on j starts at i+1. This ensures the sum is calculated correctly because, in R, a syntax such as 5:2 or 5:5 is valid and creates a descending sequence and a single-element vector, respectively. If i were allowed to reach length(primes), the range for j would start beyond the end of the vector and count downward instead of being empty.
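The behavior of the colon operator can be seen in a short sketch. The variable names are hypothetical; seq_len() is a built-in function that returns an empty vector when its argument is 0.

```r
a <- 5:2    # descending sequence: 5 4 3 2
b <- 5:5    # single-element vector: 5

n <- 0
runs <- 0
for (k in 1:n) {            # 1:0 is the vector (1, 0), so this runs twice
    runs <- runs + 1
}

safe_runs <- 0
for (k in seq_len(n)) {     # seq_len(0) is empty, so this never runs
    safe_runs <- safe_runs + 1
}
print(c(runs, safe_runs))
```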

If you run the code correctly, the output reports the prime together with the number of terms in its sum, which tells you that 997651 is a sum of 543 primes.

Summary

In this post, you learned some R programming syntax through examples, as well as how to define your own R functions. Specifically, you learned:

  • How to create loops and branches
  • How to control the flow in loops using next and break
  • How to create and use a custom function
