In a previous article (Tips for Constructing Lab Databases in Excel by Emily Crow), BitesizeBio readers began a brief, but spirited commentary on the application of using true databases (MySQL, Access, etc.) versus Excel “databases”.
While Excel can be quite useful to organize information (for example, an inventory of reagents, plasmids, laboratory items – even your own experimental projects), true databases are superior when it comes to storing, retrieving and querying data, integrity and comparisons between two or more sets.
But, how to transfer your information from your Excel worksheets into a database? Some may not care for the responsibility that database administration entails. Others may find that the task begins with a steep learning curve. You don’t have to be an IT wizard; work with your informatics group to make your Excel worksheets database-ready.
Know your Bioinformatician
A bioinformatician may already be the person administrating everything behind the scenes. He or she collects various project files and combines their contents into a single file. A good working relationship with your informatics group is beneficial at all stages, from the start of new projects, to projects in motion and all the way through to final analysis. If you have not already done so…introduce yourself and say hello to them!
Know the table structure
Your bioinformatician designed the database from the ground up (and maybe even asked for input along the way). They know the structure inside and out and can provide you with a list of category names for each data set. Using this, you will get a very clear idea of what should and should not be included into the database.
Organize your worksheet
Now it’s as simple as copying/pasting columns of your Excel data over to a fresh new worksheet. Use category information as a guide to arrange everything into an exact order. To make things cleaner, exclude extraneous entries, personal comments unrelated to projects and any special formatting.
Save and submit your database-ready worksheet
Save your reorganized data in Excel file format. Or select tab- or comma- delimited. Do note that for character delimited formats, each worksheet must be saved individually. This method is simple enough as it can enable your bioinformatician to import the data directly, with very few additional modifications.
Why Databases Are Useful
* Centralized data – the database exists in a single location where it is backed up regularly. No more searching computer after computer for worksheets that may or may not exist!
* Primary keys (unique identifiers) – this ensures each entry in your database is unique, even if you have a multi-part project where every part begins sample labeling at number 1.
* Relational Databases – compare multiple data sets to standard libraries to see which entries are identical.
* Data integrity – query the data, then query again and again. The original data is still there. In Excel, it is all too easy to sort only a portion, format entries based on the wrong type, or even accidentally edit the cells. Undo/redo your actions? Unless you saved an untouched, duplicate copy it can be difficult to start at the beginning again.
Whether you use Excel to manage information, have your own customized database on the side or even administrate all of your laboratory’s data, we would love to hear about your experiences. What works best for your lab? And what were some of the challenges along the way? Please share your story with us, in the comments below.