Friday, 10 December 2010

Never do Data-Entry

«I am sorry if I didn't write these days. Unluckly, I was very busy with a lot of work so I didn't pay much attention to other tutorials. Excuse me. I'll try to write something about next week»
It happened to me yesterday...

Data Entry is bad by design.
It's true, it's important, because without datas, a database is useless. But there are good and bad ways to insert datas. Good ways are:

  1. User-Based: users insert informations on specific designed application
  2. Data-Transfer: datas are moved from a data source (text file, excel, network, ecc.) to a destination database with a script
  3. User-Evolved: it's similiar to the user-based. It's more Zen, because in this model we admit «perfection is impossible, imperfection is normal, evolution is required». Premising we can't insert all datas, we'll delegate users to modify and to refine them. It's how Wikipedia works

Bad ways are (usually) more widespread, because data-entry is made to solve an informations hole without considering how the database will evolve, who will use it and how.

  1. One-Man-One-App: this is the "less worst" solution. A programmer is charged to implement a program to manage a database and to fill this database by himself. Though is still a bad solution, it's a bit more human because the programmer can write a program more suitable for HIS needs. More important, a single error doesn't compromise whole work
  2. Data-Source-Based: absolutely THE worst solution. Two or more users work to fill a common data-source (an excel file, a text file, ecc.)

One-Man-One-App problems are:

  1. A new "data-inserter" must be trained to learn how the hight-customized application works
  2. Move datas to another database more well designed could be difficult
  3. The official data-inserter doesn't always pay necessary attention on data-entry, because he's a programmer. Repetitive jobs make programmers angry and frustrated

Data-Source-Based is bad because
  1. Data-entry manifests in its worst way: "filling" an excel spreadsheet is one of most boring and frustrating activities on earth. Even a 100x6 table is hard to fill with care
  2. "Save button" is "Cumulate-and-Fire" anti-pattern incarnation. If someone forgets to Save, hours of works could be trashed
  3. Finding errors could be hard if you have more than 7 columns
  4. A clumsy operation (e.g. a "sort and save" just on a selection) could make datas inconsistent and ruin irremediably all the work done

If you're ALONE to fill a large database with datas wich came from paper or other non-scriptable sources, try to move yourself to a good way of data-entry. If it's impossibile, NEVER USE data-source-based: create a front-end even if you're working with an excel. This will allow you to have to avoid the "cumulate-and-fire" anti-pattern, will allow you to make a more comfortable way to insert datas and will give you a little fun when coding your script.

Anyway, data-entry is bad and boring. The only good way to do it is spreading it among three or four employed on a long timeline. 10 minutes per-day of data-entry is sopportable: two or more hours per-day doesn't.

No comments: