For the previous 2 days I was working towards the following idea, but I spend so much time switching between projects that I put it on the backburner. I NEED to finish busybox apt-get... Anyway, the idea is as follows.
All that is needed to check if new packages are out is a list of all package names and versions. Specifically, we don't need to download the descriptions of the packages just to see if anything is new.
So firstly we need only concern ourselves with 3 pieces of information about each package (name, version and revision). Here are some stats for sid:
number of unique package names, 8233
number of unique versions, 2150
number of unique revisions, 111
i.e. each package has a unique name, but not each package has a unique version or revision. In fact there are a lot of duplicate versions and revisions.
So, to store information about a current release we need 4 tables/files.
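A rough sketch of how the three sets of unique values might be collected, in Python rather than the busybox C code (which isn't shown here). The helper names `split_version` and `build_tables` and the sample packages are made up for illustration, and it assumes the revision is whatever follows the last hyphen of the version string.

```python
def split_version(full_version):
    """Split 'version-revision' at the last hyphen; revision is '' if there is none."""
    version, sep, revision = full_version.rpartition("-")
    if not sep:
        # No hyphen at all: the whole string is the version.
        return full_version, ""
    return version, revision


def build_tables(packages):
    """Return sorted lists of unique names, versions and revisions."""
    names = sorted({name for name, _ in packages})
    versions = sorted({split_version(full)[0] for _, full in packages})
    revisions = sorted({split_version(full)[1] for _, full in packages})
    return names, versions, revisions


# Made-up sample data, not real archive contents.
packages = [("foo", "1.0-1"), ("bar", "1.0-2"), ("baz", "2.3-1"), ("qux", "2.3")]
names, versions, revisions = build_tables(packages)
print(len(names), len(versions), len(revisions))
```

Even in this tiny sample the names are all distinct while the versions and revisions repeat, which is the effect the sid numbers show on a larger scale.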
A file with just unique package names, for woody this is
94074 Bytes uncompressed
35644 Bytes compressed with bzip2 -9
41332 Bytes compressed with gzip -9
A second file with unique versions, this is
16390 Bytes uncompressed
6460 Bytes compressed with bzip2 -9
7052 Bytes compressed with gzip -9
A third file with unique revisions, this is
564 Bytes uncompressed
357 Bytes compressed with bzip2 -9
362 Bytes compressed with gzip -9
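For anyone wanting to reproduce size figures like these, here is a minimal sketch, assuming each table is stored as one entry per line (the actual file layout isn't specified above). `table_sizes` is a hypothetical helper; note that Python's `gzip` module writes a slightly different header than the `gzip` command line tool, so the byte counts may differ by a few bytes.

```python
import bz2
import gzip


def table_sizes(entries):
    """Return the raw and compressed sizes (in bytes) of a one-entry-per-line table."""
    raw = ("\n".join(entries) + "\n").encode("ascii")
    return {
        "uncompressed": len(raw),
        "bzip2 -9": len(bz2.compress(raw, 9)),
        "gzip -9": len(gzip.compress(raw, compresslevel=9)),
    }


# Made-up entries; a real run would feed in the full names table.
for label, size in table_sizes(["bar", "baz", "foo", "qux"]).items():
    print(size, "Bytes", label)
```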
These tables won't be changing all the time. The names table will change every time a NEW package is added or an existing package is removed, but not when a package is updated.
The version and revision tables would change less often.
To pull all this information together we need a fourth table which has three entries for each package: the entry number in the name table for the package name, the entry number in the version table for the version, and the entry number in the revision table for the revision.
I haven't generated this file yet, but it will need exactly 5 Bytes for each package entry: 2 bytes for the name number, 2 for the version number and 1 for the revision number.
So we need a min of 5 x (approx) 8000 == 40kB to represent the package status of sid.
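The 5-byte entry described above could be packed something like this. It is only a sketch: the little-endian byte order, the `RECORD` and `pack_records` names, and the sample tables are my assumptions, not the format of any existing file.

```python
import struct

# "<HHB": 2-byte name index, 2-byte version index, 1-byte revision index.
# "<" also disables padding, so each record is exactly 5 bytes.
RECORD = struct.Struct("<HHB")


def pack_records(packages, names, versions, revisions):
    """Encode (name, version, revision) triples as indexes into the three tables."""
    name_index = {value: i for i, value in enumerate(names)}
    version_index = {value: i for i, value in enumerate(versions)}
    revision_index = {value: i for i, value in enumerate(revisions)}
    blob = bytearray()
    for name, version, revision in packages:
        blob += RECORD.pack(
            name_index[name], version_index[version], revision_index[revision]
        )
    return bytes(blob)


# Made-up sample tables and packages.
names = ["bar", "foo"]
versions = ["1.0", "2.3"]
revisions = ["", "1", "2"]
packages = [("bar", "1.0", "2"), ("foo", "2.3", "")]
blob = pack_records(packages, names, versions, revisions)
print(RECORD.size, len(blob))
```

The index widths fit the sid numbers: 8233 names and 2150 versions are each below 65536, and 111 revisions is below 256.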
On top of that we could do a binary diff using xdelta (it's a package) to represent changes between the tables.
I planned on storing the md5sum of each of the three dependent tables in the package table to prevent them getting out of sync.
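One way the md5sum check might look, assuming the three digests are simply placed at the start of the package table ahead of the 5-byte records. That layout, and the names `build_package_table` and `tables_in_sync`, are hypothetical.

```python
import hashlib

DIGEST_LEN = 16  # an MD5 digest is 16 bytes long


def _digests(names_raw, versions_raw, revisions_raw):
    """Concatenate the MD5 digests of the three dependent tables."""
    return b"".join(
        hashlib.md5(table).digest()
        for table in (names_raw, versions_raw, revisions_raw)
    )


def build_package_table(names_raw, versions_raw, revisions_raw, records):
    """Prefix the packed records with the digests of the tables they index into."""
    return _digests(names_raw, versions_raw, revisions_raw) + records


def tables_in_sync(package_table, names_raw, versions_raw, revisions_raw):
    """True if the local tables match the digests stored in the package table."""
    stored = package_table[: 3 * DIGEST_LEN]
    return stored == _digests(names_raw, versions_raw, revisions_raw)


# Made-up table contents and a single dummy record.
table = build_package_table(b"bar\nfoo\n", b"1.0\n", b"1\n", b"\x00" * 5)
print(tables_in_sync(table, b"bar\nfoo\n", b"1.0\n", b"1\n"))
print(tables_in_sync(table, b"bar\nfoo\nnew\n", b"1.0\n", b"1\n"))
```

A mismatch would mean one of the three tables has to be synced again before the indexes in the package table can be trusted.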
So to update you would have to sync your 4 files, rebuild the full package names as strings and compare them to your available file. You could then be presented with a list of packages that have been updated; if any interest you then you could do a traditional apt-get update etc.
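The update step could be sketched roughly as follows. It assumes the 5-byte little-endian record layout from the idea above, joins version and revision with a hyphen, and reads only the `Package:` and `Version:` fields from the available file. The function names and sample data are made up, and it reports any difference rather than working out which version is newer.

```python
import struct

RECORD = struct.Struct("<HHB")  # name index, version index, revision index


def rebuild(names, versions, revisions, blob):
    """Turn packed records back into a {package name: full version} mapping."""
    result = {}
    for name_i, version_i, revision_i in RECORD.iter_unpack(blob):
        revision = revisions[revision_i]
        full = versions[version_i] + ("-" + revision if revision else "")
        result[names[name_i]] = full
    return result


def parse_available(text):
    """Collect {package name: version} from Package:/Version: fields."""
    local = {}
    name = None
    for line in text.splitlines():
        if line.startswith("Package: "):
            name = line[len("Package: "):].strip()
        elif line.startswith("Version: ") and name is not None:
            local[name] = line[len("Version: "):].strip()
    return local


def changed_packages(remote, local):
    """Names whose remote version is missing locally or differs from it."""
    return sorted(name for name, ver in remote.items() if local.get(name) != ver)


# Made-up tables, records and available file contents.
names, versions, revisions = ["bar", "foo"], ["1.0", "2.3"], ["", "1", "2"]
blob = RECORD.pack(0, 0, 2) + RECORD.pack(1, 1, 0)
available = "Package: bar\nVersion: 1.0-1\n\nPackage: foo\nVersion: 2.3\n"
remote = rebuild(names, versions, revisions, blob)
print(changed_packages(remote, parse_available(available)))
```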
It could be extended to hand out individual package descriptions and rebuild the available file to keep that in sync as well, but that's looking a bit far into it at this stage.
I only spent a couple of days on it, but I have the code that generates the files lying about (though the package file is buggy).
I need to finish busybox apt-get first, it's been dragging on too long, so I won't be doing anything more on this idea for a while. If anyone wants the code I have started, let me know.
Much of the code is derived from busybox dpkg, which separates and stores the above data in hashtables… hmm… the version on my hard drive does anyway 🙂 (it still needs work as well)