Fang Talks

22 12 15

Huge data

Not to be confused with “Big Data”, which is actually a field in computer science.

For some kinds of applications you need to work with incredibly large amounts of data. Data that needs to be processed, evaluated or enhanced… programmatically. If you have a text file of a couple of megabytes, sure, no problem. But try one of a couple of gigabytes, and the process will start to take considerably longer. Especially on your low-end laptop.

It’s really cool to see how many useful things can be done with a large amount of data, but working on the code that drives it can be a bit painful at times. Ideally you’d just test as you go with small amounts of sample data. Sadly, that isn’t always possible. In some cases you need a large number of results before you can proceed and verify that all parts are functioning properly. Then you have to spend hours waiting for the run to finish, hoping it doesn’t crash halfway through or, even worse, produce incorrect results.
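For the cases where a small sample is good enough, here is a minimal sketch of one way to carve it out of a file that is too big to load into memory. It uses reservoir sampling, which reads the input once and keeps a uniformly random set of k lines. The function name `sample_lines` and the file name in the usage note are made up for illustration, not taken from any real project.

```python
import random
from typing import Iterable, List, Optional


def sample_lines(lines: Iterable[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return up to k lines chosen uniformly at random in a single pass.

    Only k lines are held in memory at any time, so this works on inputs
    far larger than RAM. (Hypothetical helper, reservoir sampling.)
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Keep the new line with probability k / (i + 1),
            # overwriting a randomly chosen slot.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = line
    return reservoir
```

Since a file object yields its lines lazily, something like `with open("huge.txt", encoding="utf-8") as f: sample = sample_lines(f, 1000, seed=42)` should give you a thousand-line test set without ever reading the whole file into memory. It still has to scan the entire file once, so it won’t be instant, but you only pay that cost one time.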

Luckily I have plenty of things to keep me occupied in that time, but imagine what this is like for researchers working with terabytes of data. Sure, they have more powerful computers, but they’re also doing much more complex things on them. Analyzing a 10 TB database can take a long time, depending on what you do with it.

“Hey boss, I’m taking a couple of weeks off again. We’ll know if I fixed the bug by the time I get back.”
~ Fang
