Figuring out binary datalog formats without a specification

2018-06-15T16:08:09+00:00

The Challenge

Because we work with semiconductor data for a wide range of customers, companies often throw interesting technical challenges at us. The most complex one so far this year is probably a request (OK, a requirement!) to interpret binary test datalog files so that they can be analysed in our yieldHUB database system. The company gave us little information about the actual binary format, so we had to interpret the binary files based on their equivalent ASCII files and our knowledge of how datalog data is usually stored.

We needed to parse the binary files directly because generating the ASCII version is a manual operation in desktop software. Converting their existing archive of binary files by hand into a format that can be loaded into our database could take thousands of man-hours.

After several days of combing through the binary files, searching for patterns and for the values we knew from the ASCII version, we were able to write our own converter for the binary files. We wrote the converter as a command-line program so that we could integrate it into the automated datalog processing and allow our customer to analyse their historical data using the tools in yieldHUB. The journey was extremely challenging, but when we finally discovered all the needed information, it felt all the more satisfying.
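The search for known values can be illustrated with a minimal Python sketch. It is not the converter described above. The function names, the sample bytes and the choice of 32-bit signed integers are assumptions made purely for illustration. The idea is to take text and integer values that appear in the ASCII version and list every offset at which they occur in the binary file, trying both byte orders for the integers.

```python
import struct


def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def locate_known_values(data: bytes, text_values, int_values) -> dict:
    """Map each known value to the offsets where it appears in data.

    Integers are assumed to be 32-bit signed (an assumption, not a fact
    about any particular tester format) and are tried in both byte orders.
    """
    hits = {}
    for text in text_values:
        hits[("ascii", text)] = find_all(data, text.encode("ascii"))
    for value in int_values:
        for byte_order, fmt in (("little", "<i"), ("big", ">i")):
            hits[(byte_order, value)] = find_all(data, struct.pack(fmt, value))
    return hits


# Made-up sample: a lot name followed by the same integer in both byte orders.
sample = b"LOT42\x00" + struct.pack("<i", 1025) + struct.pack(">i", 1025)
hits = locate_known_values(sample, ["LOT42"], [1025])
for key, offsets in hits.items():
    print(key, offsets)
```

Values that are unlikely to occur by chance, such as lot names or large counts, make the most useful needles, because a small integer like 1 will match at many unrelated offsets.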

Lessons Learned

In this project, the key lessons we learned are the following:

- The endianness of the binary files needs to be understood first. We got stuck in a confusing trial-and-error process because some files were little-endian and others were big-endian. We did not expect a single system to produce both, but it did in this case.

- The script or software used to read and interpret the binary files needs a way to specify the endianness of the binary data. We initially used an older version of our scripting language to detect the numbers, and we never found the correct values until we realized we needed to use the newer version.

- The 8-byte floating-point value 0.12 may not be exactly 0.12 in the binary representation; it could be 0.11999999999. Because we were aware of this from the start, we did not look for floats or doubles at the beginning but rather for ASCII text and integers.

- A good visualization tool is key to finding the patterns in the binary data, especially in the parts of the data that represent multiple records. The tool we used allowed us to see the obvious patterns very quickly. It also helped us validate byte positions and block sizes for the multiple-record data.

Does your company have such an archive of data that you can’t interpret or analyse any more?
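The lessons on byte order, floating-point values and record layout can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the tooling used in the project: `find_doubles`, `hex_rows`, the tolerance of 1e-6 and the sample bytes are all hypothetical. The first function scans every offset for an 8-byte double that is close to an expected value, in both byte orders, because an exact byte-for-byte match on 0.12 would miss a stored 0.11999999999. The second prints the data in rows of a candidate record width, a crude stand-in for a visualization tool, so that repeating fields line up in columns when the width is right.

```python
import math
import struct


def find_doubles(data: bytes, expected: float, rel_tol: float = 1e-6) -> list:
    """Return (offset, byte_order) for each 8-byte window close to expected.

    Every offset is tried, since field alignment is unknown, and each
    window is decoded as both a little-endian and a big-endian double.
    """
    matches = []
    for offset in range(len(data) - 7):
        for byte_order, fmt in (("little", "<d"), ("big", ">d")):
            (value,) = struct.unpack_from(fmt, data, offset)
            # isclose is False for NaN, so garbage windows are skipped.
            if math.isclose(value, expected, rel_tol=rel_tol):
                matches.append((offset, byte_order))
    return matches


def hex_rows(data: bytes, width: int) -> list:
    """Format data as hex rows of `width` bytes, each prefixed by its offset."""
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        rows.append(f"{start:06x}  {chunk.hex(' ')}")
    return rows


# Made-up sample: three padding bytes, then the "same" measurement stored
# once as a big-endian and once as a little-endian double.
sample = (
    b"\x00" * 3
    + struct.pack(">d", 0.11999999999)
    + struct.pack("<d", 0.12)
)

# An exact byte comparison would not treat the two encodings as equal.
print(struct.pack(">d", 0.11999999999) == struct.pack(">d", 0.12))

print(find_doubles(sample, 0.12))
for row in hex_rows(sample, 8):
    print(row)
```

A looser tolerance finds more candidates but also more false positives in real files, which is one more reason to anchor the layout on ASCII text and integers first and only then confirm the floating-point fields.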

Customer Testimonial

“This allows us to process ten years of data into yieldHUB from an old obsolete tester archive we could not otherwise read from”

Director of Engineering – Texas

Can yieldHUB Help You?

Contact yieldHUB today! Our global sales and support team will be happy to help.