How to choose between small file size and simpler code?
Posted: Fri Oct 11, 2019 11:26 pm
How do you choose between minimizing the disk space used by data files and making the code much simpler?
I'm writing an accounting app, and thought I'd store IDs for the accounts rather than account names, because "27" is shorter than "Chase Checking". (Of course, there's another table that ties the account numbers to account names.) Using IDs instead of account names saves only a few bytes per entry, but multiplied by at least three entries per transaction, and thousands of transactions, it adds up. But does it add up "too much"?
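For what it's worth, here is a minimal sketch of what the ID-based layout could look like in SQLite, using Python's built-in sqlite3 module. The table and column names (accounts, entries, account1_id, account2_id) are made up for illustration and are not from the actual app; the ID-to-name pairs are taken from the sample lines in this post.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE entries (
        entry_date  TEXT,
        amount      TEXT,
        account1_id INTEGER REFERENCES accounts(id),
        account2_id INTEGER REFERENCES accounts(id),
        memo        TEXT
    );
""")
conn.executemany(
    "INSERT INTO accounts (id, name) VALUES (?, ?)",
    [(5, "Chase Checking"), (16, "Office Expense")],
)
conn.execute(
    "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
    ("10/21/19", "356.20", 5, 16,
     "SomeHost: Registration & webhosting, 2 years"),
)


def entries_with_names(conn):
    """Resolve both account IDs to names by joining the lookup table twice."""
    return conn.execute("""
        SELECT e.entry_date, e.amount, a1.name, a2.name, e.memo
        FROM entries AS e
        JOIN accounts AS a1 ON a1.id = e.account1_id
        JOIN accounts AS a2 ON a2.id = e.account2_id
    """).fetchall()


print(entries_with_names(conn))
```

The extra code that IDs require is mostly this kind of join on every read, plus an ID lookup on every write.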
Here's a sample line of data with account numbers:
Code:
10/21/19 1199199600 356.20 05 16 SomeHost: Registration & webhosting, 2 years
And here it is with account names:
Code:
10/21/19 1199199600 356.20 Chase Checking Office Expense SomeHost: Registration & webhosting, 2 years
For 3000 transactions, with each transaction saved into an SQLite database 3 times (trust me on that one), and assuming perfect efficiency of the SQLite storage mechanism (because I don't know how to calculate SQLite's overhead), that's 765 KB for storing IDs versus 976 KB for storing names. In percentage terms, account names take quite a bit more space (28%), but in raw terms, an extra 211 KB just isn't that much. So I'm strongly considering ditching account numbers and just using account names.
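As a rough check on the size estimate, the difference between the two sample lines can be counted directly. This sketch assumes the lines exactly as they appear in this post (single spaces between fields) and one byte per character. Like the estimate itself, it ignores SQLite's own record and page overhead, so a real file will come out somewhat different.

```python
# Sample rows copied from the post, with single spaces between fields.
row_ids = ("10/21/19 1199199600 356.20 05 16 "
           "SomeHost: Registration & webhosting, 2 years")
row_names = ("10/21/19 1199199600 356.20 Chase Checking Office Expense "
             "SomeHost: Registration & webhosting, 2 years")

TRANSACTIONS = 3000
COPIES_PER_TRANSACTION = 3  # each transaction is stored 3 times

# All characters here are ASCII, so character count equals byte count.
extra_bytes_per_row = len(row_names) - len(row_ids)
extra_bytes_total = extra_bytes_per_row * TRANSACTIONS * COPIES_PER_TRANSACTION

# Relative increase, using the totals quoted in the post (765 KB vs 976 KB).
relative_increase = (976 - 765) / 765

print(extra_bytes_per_row)              # 24
print(extra_bytes_total)                # 216000
print(round(extra_bytes_total / 1024))  # 211
print(f"{relative_increase:.0%}")       # 28%
```

So the names cost 24 extra bytes per stored row, which over 9000 rows is about 211 KB, matching the figure quoted above.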
It's not a problem if someone changes an account name; that's a simple change to the database to update the affected records.
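A rename under the names-only design might look roughly like the sketch below. The entries table and its account1 and account2 columns are hypothetical names chosen for illustration, and the new account name is invented for the example. The update runs inside one transaction so a failure partway through doesn't leave some rows with the old name and some with the new one.

```python
import sqlite3

# Hypothetical names-only schema: column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entries (
        entry_date TEXT,
        amount     TEXT,
        account1   TEXT,
        account2   TEXT,
        memo       TEXT
    )
""")
conn.execute(
    "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
    ("10/21/19", "356.20", "Chase Checking", "Office Expense",
     "SomeHost: Registration & webhosting, 2 years"),
)


def rename_account(conn, old_name, new_name):
    """Rewrite every row that mentions old_name; return how many rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if an exception is raised
        # Column names come from this fixed tuple, never from user input.
        for column in ("account1", "account2"):
            cur = conn.execute(
                f"UPDATE entries SET {column} = ? WHERE {column} = ?",
                (new_name, old_name),
            )
            changed += cur.rowcount
    return changed


print(rename_account(conn, "Chase Checking", "Chase Business Checking"))
```

With IDs, the same rename would touch a single row in the lookup table instead of every affected entry, so that is the trade being made here.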
I'm not worried about the data file growing ever larger year after year, because different years can be stored in different files. In fact, that's one reason I'm writing this software: the commercial app I was using insisted on keeping all years in a single file, and since that single file recently got corrupted, I'm stuck with having to re-enter thousands of transactions for six years' worth of data.