Ruby on Medicine: Scrolling Through Large Files

Originally published at: http://www.sitepoint.com/ruby-medicine-scrolling-large-files/

In my tutorial Handling Large Files, we saw how to use Ruby to extract some portion of text from a large text file. The file from that post is VERY large (3.3 GB!), and now it’s time to improve the approach we used in the past tutorial a bit.

Today, rather than extracting a portion of the large text file, we want to navigate through it. In other words, we want to scroll through that large text file smoothly, the Ruby way!

So as not to reinvent the wheel regarding terminology and the file we will be working with, please see the sections “Terminology” and “Obtaining the file” from Handling Large Files. The first section walks you through some concepts that are useful for this tutorial. The second shows you where to get the large text file, since this is the toy we will be playing with in this tutorial.

Let’s get started.

Read, Read, Until You Get Tired…

We have a very large text file on our hands now. You’re probably used to text files being the smallest files. This, however, is not the case when it comes to genomes. When you open a text file related to genomes, expect that you will read a lot, and I mean a lot!

As mentioned above, rather than extracting some portion of this large text file, we want to navigate (scroll) smoothly through the text file. The idea is that, instead of obtaining portion by portion as was shown in the last tutorial, you may decide that you just want to keep scrolling until you get tired of the process.

The previous article demonstrated how the text editors I used for opening the text file just went crazy. We need Ruby at this point.

Let’s write a script that the user can run at the command line to specify the file and scroll through it smoothly. The first thing we do is ask the user for the file name, which will be stored in a variable. Open a blank file and add the following:

puts "Enter the file name you want to scroll through"
file_name = gets.chomp

gets is a method that gets the user input as a string. chomp is used to remove \n, which you obtain when pressing the enter/return key.
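
To see the effect of chomp in isolation, here is a minimal sketch. The hard-coded string is only a stand-in (an assumption for illustration) for what gets would return after the user types hg38.txt and presses enter:

```ruby
# Stand-in for the string gets returns once the user presses enter.
raw_input = "hg38.txt\n"

# chomp strips the trailing newline, leaving just the file name.
file_name = raw_input.chomp

puts raw_input.inspect
puts file_name.inspect
```

Without chomp, the trailing newline would become part of the file name and File.open would most likely fail to find the file.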

Great! We have read the file name. Now, it’s a simple matter of opening that file. This is done as follows:

input_file = File.open(file_name,'r')

The file is opened in read mode, which is specified by the r.
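
As a side note, File.open also accepts a block, in which case Ruby closes the file automatically when the block finishes. The sketch below is not the form used in this tutorial; it uses a small temporary file (an assumption, standing in for the genome file) so that it can run anywhere:

```ruby
require 'tempfile'

# A tiny temporary file stands in for the large genome file.
sample = Tempfile.new('sample')
sample.write("line one\nline two\n")
sample.close

# Block form: the file is closed for us as soon as the block returns.
first_line = File.open(sample.path, 'r') do |input_file|
  input_file.gets
end

puts first_line
sample.unlink
```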

After opening the file, let’s go through that file line by line. Since we want to navigate (scroll) through the text file, it would be a good idea to display the output chunk by chunk. In other words, display a specific amount of text, and then ask the user to press enter to continue scrolling or type EXIT to terminate the program.

A Ruby method that comes in handy in this step is each_line, which reads each line from the text file. We can do the following:

input_file.each_line do |line|
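
To make the idea concrete, here is a small self-contained sketch of each_line with a complete block. The three-line temporary file and the line_count variable are assumptions for illustration only. The point is that each_line hands over one line at a time, so the file is never loaded into memory as a whole:

```ruby
require 'tempfile'

# A three-line temporary file stands in for the genome file.
sample = Tempfile.new('genome_sample')
sample.write("ACGT\nTTGA\nCCGA\n")
sample.close

line_count = 0
File.open(sample.path, 'r') do |input_file|
  # The block runs once per line; each line still carries its newline.
  input_file.each_line do |line|
    line_count += 1
    print "#{line_count}: #{line}"
  end
end
sample.unlink
```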

In order to keep navigating (scrolling) through the text file, we’ll use a while true loop, which is an infinite loop. However, we’ll add a conditional (i.e. if) statement to check the user’s response and exit as needed.

I mentioned above that we would output the text in chunks. Let’s say that the chunks are 100 lines each. In this case, after displaying those 100 lines, we ask the user to press enter to continue reading, or to type EXIT to, well, exit.

Thus, we can add the following if-statement for this case:

if user_input == 'EXIT'
  exit
end

As for continuing to scroll, the user can simply press enter. We’ll need to save some state so we know when to stop and prompt the user to continue. A counter keeps track of the number of lines displayed. When the counter reaches the value 100, this means that 100 lines have been displayed, and it is time to prompt the user to take action. At the same time, we reset the counter to 0 so it can keep track of the next 100 lines.
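
The counter logic described above can be tried out on its own, without any file. In this sketch the 250 "lines" are just numbers, and prompt_after is a hypothetical array that records where the script would pause:

```ruby
page_size = 100
counter = 0
prompt_after = [] # line numbers at which the script would stop and prompt

(1..250).each do |line_number|
  counter += 1
  if counter == page_size
    counter = 0                 # start counting the next chunk
    prompt_after << line_number # this is where the user would be asked
  end
end

puts prompt_after.inspect
puts counter
```

With 250 lines and chunks of 100, the script pauses after lines 100 and 200, and the counter is left at 50 for the final, shorter chunk.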

I will show the entire script in the next section.

Putting It All Together

Here’s our fancy, new Ruby script:

puts "Enter the file name you want to scroll through"
file_name = gets.chomp
input_file = File.open(file_name, 'r')
counter = 0 # used to keep track of the number of lines displayed
user_input = ' ' # stores input from user
while true
  input_file.each_line do |line|
    print line
    counter = counter + 1
    if counter == 100
      counter = 0
      puts 'To continue scrolling, press enter...'
      puts 'To terminate, type EXIT and press enter'
      user_input = gets.chomp
      if user_input == 'EXIT'
        exit
      end
    end
  end
  break # end of file reached; without this the loop would spin forever
end
input_file.close
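
For readers who want to experiment, the same idea can be wrapped in a method so it can be exercised without typing at the keyboard. This is only a sketch under stated assumptions, not the script above: the scroll method, its parameters, and the use of each_slice in place of a manual counter are all my own choices for illustration.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: pages through a file, reading answers from `input`
# and writing to `output`. Returns the number of pages shown.
def scroll(path, page_size: 100, input: $stdin, output: $stdout)
  pages_shown = 0
  # File.foreach without a block returns an enumerator; each_slice
  # groups its lines into arrays of at most page_size lines.
  File.foreach(path).each_slice(page_size) do |page|
    output.print(page.join)
    pages_shown += 1
    output.puts 'Press enter to continue, or type EXIT and press enter to stop'
    answer = input.gets
    # Stop on EXIT, or when there is no more input (gets returns nil).
    break if answer.nil? || answer.chomp == 'EXIT'
  end
  pages_shown
end

# A five-line temporary file stands in for the genome file.
sample = Tempfile.new('scroll_demo')
5.times { |i| sample.puts("line #{i + 1}") }
sample.close

# Simulated user: presses enter once, then types EXIT.
fake_input = StringIO.new("\nEXIT\n")
captured = StringIO.new
shown = scroll(sample.path, page_size: 2, input: fake_input, output: captured)
puts "pages shown: #{shown}"
sample.unlink
```

Called as scroll('hg38.txt') it would read from the keyboard and print to the terminal, since input and output default to $stdin and $stdout.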

Running the Program

In order to run the above script, type the following at your command line (assuming the file name is scroll.rb):

ruby scroll.rb

You will be prompted to enter the file name, which in our case is hg38.txt.

When you run the program, the first 100 lines will be displayed and you will get a prompt asking you to continue or to terminate. The figure below shows the third page (scroll) displayed:

If you type EXIT instead of just pressing enter, the program will terminate and you’ll be free to go on your merry way.

EXIT

As we saw in this tutorial, Ruby enables us to scroll through very large text files smoothly and easily. Now, we are not constrained by the weaknesses of text editors when dealing with large files.

Do you think that this idea could lead to building a Ruby-based text editor? What benefits do you think such an editor could provide? Do you think it would be high performance? What other issues would the editor need to handle?

In the next article in the Ruby on Medicine series, we will go a-hunting for the elusive Gene Sequence. Stay tuned!


I’m not really sure, but is the result of your programming work any different from running less <filename>?

Anyhow, this is a nice project for Ruby starters, and I think it makes sense in a way. But it still reminds me of the old days, when everyone was building their own homebrew solutions to simple (and already solved) problems.


I do not disagree with your assessment.

Ruby is very good at parsing/manipulating strings. And, in the context of a larger application (or DSL), this functionality would be fundamental and very handy. As an illustration this serves a valid purpose.

I agree with ParkinT. In the article before this one in the series, @abderhasan makes a point about how many folks in the medical field are new to Ruby, so some of the things we think are “old” or “common” are brand new to them.

Thanks all for your nice comments, which I was very happy to read, and which will surely improve the series. Yes, as @ruprict mentioned, the series is aimed not only at Rubyists but also at researchers and doctors, for instance, who may need to learn things that professional Rubyists see as trivial. There’s more to come in this series, and I am always looking for your kind feedback and comments.
