Package lists split per section – Debian Planet

    Package lists split per section
    Submitted by Anonymous on Thursday, April 04, 2002 – 17:51
    To help reduce bandwidth consumption and download time on slow connections, how about splitting up the Packages file by section (base, net, games, etc.)? This way you could choose which sections you wanted to download descriptions for. For example, people running a simple server could remove the source for games and graphics package descriptions, and somebody running a desktop for word processing could remove the source for devel and science.
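As a rough illustration of the proposal, here is a minimal Python sketch that groups the stanzas of a Packages-style file by their `Section:` field, so each group could be published as its own, smaller file. The package names and versions in the sample are made up, and the `unknown` fallback is an assumption of this sketch, not something the archive tools do.

```python
import re
from collections import defaultdict


def split_by_section(packages_text):
    """Group the stanzas of a Packages-style file by their Section field."""
    sections = defaultdict(list)
    # Stanzas are separated by one or more blank lines.
    for stanza in re.split(r"\n\s*\n", packages_text.strip()):
        if not stanza.strip():
            continue
        section = "unknown"  # assumed fallback when a stanza has no Section field
        for line in stanza.splitlines():
            if line.startswith("Section:"):
                section = line.split(":", 1)[1].strip()
                break
        sections[section].append(stanza)
    # One text blob per section, ready to be written out as its own file.
    return {name: "\n\n".join(stanzas) + "\n" for name, stanzas in sections.items()}


# Made-up sample data, for illustration only.
SAMPLE = """\
Package: example-game
Version: 1.0-1
Section: games

Package: example-net-tool
Version: 2.3-4
Section: net

Package: another-game
Version: 0.5-2
Section: games
"""

per_section = split_by_section(SAMPLE)
print(sorted(per_section))  # ['games', 'net']
```

A firewall box would then fetch only the `net` blob, while a desktop would fetch both.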


    Subject: Versioned Packages.gz files
    Author: chewie
    Date: Friday, 2002/04/05 – 22:39
    Multiple Packages.gz files do address the issue of being able to exclude large groups of software. But I would be surprised if the average workstation didn’t have software installed from 80% of the categories anyway. What exactly would we be gaining here?

    Another, more complicated answer would be to employ a versioned Packages.gz file. The necessary patch files could be generated and stored in a filesystem layout based on the version number. For example, apt would request the Packages.diff.gz file from the directory of its /current/ version (i.e. ftp://ftp.site.tld/debian/dists/woody/i386/main/Packages/20020405001/Packages.diff.gz)

    CVS would not be required for such a system, since the client only needs to know its current file version. The server can generate patches either per request through a scripted backend, or ahead of time as static files (depending on how you want to manage your disc space and how complicated you want to make things). If the patch exceeds a given percentage of the total size of the Packages.gz file (which may actually be symbolically linked to ./version/Packages.gz), the patch could be excluded, and apt would download the new Packages.gz file in its entirety. The downside of this is the increased maintenance of patch and diff files, and ensuring that synchronization is simultaneous for all patches and new versions of the source file.
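The "patch unless it is too big" rule above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_update`, the 50% cutoff, and the use of `difflib` unified diffs are assumptions of the sketch, not anything apt or the archive scripts actually do.

```python
import difflib


def choose_update(old_text, new_text, max_ratio=0.5):
    """Return ("patch", diff) or ("full", new_text).

    max_ratio is a hypothetical cutoff: if the diff is larger than this
    fraction of the full file, just send the whole file instead.
    """
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile="Packages.old",
            tofile="Packages.new",
        )
    )
    if len(diff) > max_ratio * len(new_text):
        return "full", new_text
    return "patch", diff


# Made-up package list: 200 stanzas, then one version bump.
old = "".join(f"Package: pkg{i}\nVersion: 1.0\n\n" for i in range(200))
new = old.replace("Package: pkg7\nVersion: 1.0", "Package: pkg7\nVersion: 1.1")
kind, payload = choose_update(old, new)
print(kind, len(payload), "bytes instead of", len(new))

# A list that changed almost completely is cheaper to send whole.
rewritten = "".join(f"Package: other{i}\nVersion: 2.0\n\n" for i in range(200))
print(choose_update(old, rewritten)[0])
```

For a single version bump the patch is a small fraction of the full list, which is the whole point of the scheme. The cutoff only matters after a big archive churn.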

    This solution does not address the issue of downloading package descriptions about software you have no interest in. If this was the desired goal of apt clients, then an apt server application would need to be developed so that the client could selectively query about packages instead of relying upon locally cached databases (Packages.gz). Such a setup would migrate away from the KISS philosophy that much of the Debian packaging system embraces.



    Subject: Re: Package lists split per section
    Author: Anonymous
    Date: Friday, 2002/04/05 – 11:51
    For technical reasons I think this idea is not the way to speed up the updates on the Packages file.

    Currently there is one Packages file, so only one connection has to be set up. Once you’re using multiple Packages files you’ll have to set up a connection for EVERY file you want to download and, looking at the above proposal, that’s going to be a lot. Not exactly worth the extra overhead this is going to create.

    Furthermore, splitting up the Packages file is not going to solve the major cause: many applications come in multiple flavours. So even if you don’t use e.g. hamradio, you may use a browser, and you’ll be downloading info about well over a dozen different browsers, of which you will probably only look at one or two.
    Still a lot of useless info is downloaded.

    Finally, there’s the problem of update frequency. This is especially true for testing and unstable.
    Many applications don’t change much over time, but every category has a few apps that are updated almost daily, and with this proposal you’ll still generate a lot of useless, bandwidth-consuming Packages files.

    What might work is only updating those parts of the packages file that have been changed since last it was updated.
    Various ways exist to do this, rsync and diffs on a daily basis are the two most likely suspects to handle this.
    Rsync is, however, an option even worse than this proposal since it will put a major burden on the mirrors due to the cpu-intensive way rsync works.
    Using diffs, with a reasonable schedule for when to download the entire Packages file instead of a diff, only uses up a bit of disk space.

    The latter methods have also been discussed in response to a posting on debian-devel recently and it seems like the real resolution to this problem will be discussed further once Woody has gone stable.
    See the following URL and its replies:
    http://lists.debian.org/debian-devel/2002/debian-devel-200203/msg01966.html

    Thomas


    Subject: Re: Package lists split per section
    Author: Anonymous
    Date: Friday, 2002/04/05 – 11:41
    My question is maybe stupid, but why isn’t it possible to update the Packages file with rsync? This program is supposed to transfer only the differences between the local file and the remote one, reducing the bandwidth and the time needed to update the packages list… Or maybe it won’t work that efficiently because the Package files are compressed?

    Subject: Re: Package lists split per section
    Author: Anonymous
    Date: Friday, 2002/04/05 – 11:54
    As I said in my posting below: it eats up processing power on the server. And I don’t think many admins of the mirrors would like that.

    Subject: Metadata Client/Server
    Author: Anonymous
    Date: Friday, 2002/04/05 – 05:52
    This problem has been discussed a lot, and I think most proposed solutions are bad as they don’t really solve the problem of unrequired information being sent.

    This solution is marginally better, but it’s still sending lots of unneeded information in each section.

    The real problem is that people don’t want the Packages.gz file in order to get a list of all the metadata of all the packages in a dist. People want the Packages.gz file to know which packages are updated, or to choose whether to download/install a package.

    I now think the best solution is to create a client/server that can be queried remotely to spit out metadata as requested.

    e.g. If you want to know the latest versions of all packages, then send it a request for all package names and versions, and receive only that.

    If you want to know the complete entry for a package, or all packages in a section, or the package version in different dists, then request and receive only that.

    The servers that send metadata should NOT aim at sending any binary or source packages, just metadata. They are two different jobs, and trying to serve binary or source packages in this way would just mean fewer people would run such a server.

    Maybe if there were enough of these servers they could be used in a distributed way to identify the best mirror to download the actual package from.
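To make the idea concrete, here is a small Python sketch of the server-side query logic: parse the Packages-style stanzas once, then answer each request with only the fields asked for. The function names and the sample data are hypothetical, and no wire protocol is shown, only the selection step.

```python
def parse_packages(text):
    """Parse Packages-style stanzas into a list of dicts."""
    packages = []
    for stanza in text.strip().split("\n\n"):
        fields, last = {}, None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and last:
                # Continuation line: fold it into the previous field.
                fields[last] += "\n" + line.strip()
            elif ":" in line:
                last, value = line.split(":", 1)
                fields[last] = value.strip()
        if fields:
            packages.append(fields)
    return packages


def query(packages, wanted_fields, section=None):
    """Return only the requested fields, optionally limited to one section."""
    return [
        {name: pkg[name] for name in wanted_fields if name in pkg}
        for pkg in packages
        if section is None or pkg.get("Section") == section
    ]


# Made-up sample data, for illustration only.
SAMPLE = """\
Package: example-game
Version: 1.0-1
Section: games
Description: a made-up game
 with a longer description line

Package: example-net-tool
Version: 2.3-4
Section: net
Description: a made-up network tool
"""

index = parse_packages(SAMPLE)
versions = query(index, ["Package", "Version"])                    # "what is new?"
games = query(index, ["Package", "Description"], section="games")  # "show me games"
print(versions)
print(games)
```

The first query is what a plain update check needs, and it never ships a single description.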


    Subject: Re: Metadata Client/Server
    Author: jbert
    Date: Friday, 2002/04/05 – 09:44
    Hmm. You could do this in LDAP in a fairly straightforward way.

    Plusses:

    – LDAP is a hierarchical DB, so you can split different distributions (and sections within a distribution) into a nice tree format. When the client runs a search, you can base it anywhere in the tree and just search the info you want.

    – OpenLDAP is a free server

    – Good language bindings for C, Perl, Java and probably others

    – Client can search on “last modified time”. This allows each updating client to request exactly the info wanted (i.e., which package entries have changed since time X, in dist Y in sections A, B, C)

    Minusses:

    – Some people don’t like it.

    – Requires running LDAP server on participating mirrors. (Although…could you run the LDAP query against a master server and then fetch the packages from a local server?)

    – Requires more glue code to keep things in synch

    Would anyone like more information? To whom should I speak to take this forward?
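A sketch of what the "changed since time X, in sections A, B, C" search might look like as an LDAP filter string, built in Python. `modifyTimestamp` is the standard operational attribute for last modification time. The `debSection` attribute and the idea of one entry per package are assumptions made up for this illustration, and the section values are not escaped, so this is only safe for plain names like `net`.

```python
from datetime import datetime, timezone


def changed_since_filter(since, sections):
    """Build an LDAP search filter for entries modified at or after `since`.

    `since` must be a timezone-aware datetime.
    `debSection` is a hypothetical attribute name for the package section.
    """
    # LDAP timestamps use GeneralizedTime, e.g. 20020404175100Z.
    stamp = since.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")
    section_terms = "".join(f"(debSection={name})" for name in sections)
    return f"(&(modifyTimestamp>={stamp})(|{section_terms}))"


last_update = datetime(2002, 4, 4, 17, 51, tzinfo=timezone.utc)
print(changed_since_filter(last_update, ["base", "net"]))
# (&(modifyTimestamp>=20020404175100Z)(|(debSection=base)(debSection=net)))
```

The client would run this search with its base set at the distribution it tracks, and remember the time of the search for next time.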


    Subject: Re: Metadata Client/Server
    Author: Anonymous
    Date: Friday, 2002/04/05 – 13:36
    I just posted a message about this idea to debian-devel mailing list.

    Subject: Re: Package lists split per section
    Author: MBCook
    Date: Friday, 2002/04/05 – 04:11
    I would have to agree with this. The fact is, more than 90% of each apt-get update I do is the testing/main Packages file. So even with multiple servers, only so much ever happens. Splitting into multiple files would REALLY help.

    Subject: Good start, but split on frequence too
    Author: Anonymous
    Date: Thursday, 2002/04/04 – 20:32
    The suggested modification makes sense to me, but to REALLY cut back on bandwidth and processing time, split the package files on modification frequency. We already do this with stable/testing/unstable, of course. I’m suggesting a finer granularity, and weekly/monthly comes to mind. There are a lot of packages which change frequently, which causes package information for other packages to be re-downloaded.

    Not knowing much about the internals, it seems like this could be made transparent to the user. Have Packages.*.gz replace Packages.gz, and then add a supplemental file to list what goes in the *. That would also mean that new package information could be added without re-downloading the old files, by adding another Packages file. If new files obsoleted old ones, then the weekly (or daily?) package files could include changes to packages listed in the monthly files:

    Packages.gz – released package list
    Packages.m04y02.gz – April’s monthly package list
    Packages.w14y02.gz – this week’s weekly package file
    Packages.d094y02.gz – today’s daily package file

    This might also lead to speeding up package tools by reducing redundant processing.
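The naming scheme above is easy to generate from a date. Here is a short Python sketch, assuming "w" means the ISO week number and "d" the day of the year (the post doesn't say which week numbering is meant, so that part is a guess):

```python
from datetime import date


def package_list_names(day):
    """Return the released, monthly, weekly and daily file names for `day`."""
    iso_week = day.isocalendar()[1]  # assumption: ISO 8601 week numbering
    yy = day.strftime("%y")
    return [
        "Packages.gz",
        f"Packages.m{day:%m}y{yy}.gz",
        f"Packages.w{iso_week:02d}y{yy}.gz",
        f"Packages.d{day:%j}y{yy}.gz",
    ]


print(package_list_names(date(2002, 4, 4)))
# ['Packages.gz', 'Packages.m04y02.gz', 'Packages.w14y02.gz', 'Packages.d094y02.gz']
```

One wrinkle with ISO weeks: around New Year the ISO week can belong to the neighbouring year, so a real scheme would have to pick which year goes in the name.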


    Subject: Re: Good start, but split on frequence too
    Author: Anonymous
    Date: Friday, 2002/04/05 – 11:51
    I don’t see the point of having weekly package files instead of just updating the current files weekly. What’s the advantage?

    Subject: The advantage is less redundancy
    Author: Crag
    Date: Saturday, 2002/04/06 – 02:16
    The current file doesn’t change completely every week. The difference in the file from week to week is (I assume) much smaller than the entire file. By only sending the changed package information, apt-get update will be much faster on slow connections.

    So, the point is decreased bandwidth usage.


    Subject: Re: Good start, but split on frequence too
    Author: Anonymous
    Date: Thursday, 2002/04/04 – 23:20
    This is also a good idea, as was the rsync idea, but it would require altering apt-get to fetch the different file options. Also the server would have to keep multiple versions of the packages to match each description. Simply making a separate Packages file for each section wouldn’t require changing anything; it would just mean adding more directories with smaller package description files, and the current all-in-one Packages file could still be kept unchanged. Anybody running a mirror of the packages could also do this unofficially and it wouldn’t change anything either.

    Subject: Mmmmmm… rsync
    Author: Anonymous
    Date: Friday, 2002/04/05 – 04:58
    I know this has probably been hashed over time and time again, but I have to say it. I’d love it if the package file (whether it’s one big one or one for each section) used rsync to update. Being on a dialup connection, it would be nice to be able to update in under a minute rather than having to wait 8 minutes just for the package file(s) update.

    Being a desktop user (and also a newbie programmer of less than 6 months) I like having all the packages listed (sources, devels, games, graphics, X, etc). I like seeing what’s out there. Perhaps I’m a rare breed, but for me, if I were given the choice between the two, I’d ask more for rsync than separate package files. I know… no one is really posing a choice between the two here, but I just wanted to voice an opinion. I love apt. But the lengthy download of just the packages file(s) is my only annoyance with the system.

    Not trying to offend,
    Jeremy


    Subject: Re: Mmmmmm… rsync
    Author: cef
    Date: Friday, 2002/04/05 – 10:49
    There were a large number of issues (legal and code) with rsync (they may be over with now), and if you bugged Tridge (or Rusty) about rsync at all (or apt-proxy) you’d hear the whole deal.

    From what I remember, the legal issue has to do with a company owning a patent on making dial-up modem connections faster by using a diff-like method of comparison to only send changes over the line. Of course, this is practically what rsync does. And could you see anyone making rsync not work over modem connections? Apart from how you could do it (which would be rather hard), it’s like a punch in the face to modem users. “Oh yes we can save you heaps of bandwidth and make it faster, but only if you already have enough bandwidth that it probably won’t matter!”

    The end result is that gzip rsyncable is coming or has already arrived (which allows gzip files created in a special way to work with rsync), so it’s just a matter of when. The rsync issues are not as easily explained, but it also looks like this is on the way. Until all the issues are out of the way, I wouldn’t expect to see apt use it.

    Remember: Debian is slow on the uptake with some things for a reason. They took ages to get into the process of moving the crypto stuff out of non-US and into main, to make sure they wouldn’t get bitten. They sought legal advice too (thanx for paying that HP!) before they did it to make sure they didn’t put their feet in hot water.

    PS: IANAL & IANROT – You want to know the details, ask Rusty or Tridge. Expect the following sort of response: “*sigh* Not this again!”


    Subject: Re: Mmmmmm… rsync
    Author: Anonymous
    Date: Friday, 2002/04/05 – 18:45
    Ah, I believe I understand. I appreciate your response on it.

    Expect the following sort of response: “*sigh* Not this again!”

    heh… perhaps I shouldn’t ask. Just knowing your explanation is good enough. Thanks again.

    Jeremy


    Subject: Modify apt? The horrors!
    Author: Crag
    Date: Friday, 2002/04/05 – 03:03
    apt has been under heavy development for how long? And the idea of modifying it to fetch a list of files to fetch is bad…how?

    The server wouldn’t have to keep multiple package versions because people wouldn’t _not_ fetch a particular package file. There wouldn’t be anyone grabbing just the monthly files. There’s no reason _not_ to grab the updated package information, and I don’t even think it should be an option. Just modify apt so that instead of getting Packages.gz it gets Packages-list and then fetches all the files listed by Packages-list. The added overhead would then be the cost of fetching and reading the list of package files, and of reading redundant package information which has been overridden.
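Roughly, the client side of that could look like the Python sketch below. It is not how apt works: `fetch` is a stand-in for whatever transport is used (it returns the already decompressed text of a file, or `None` if the file is missing), the file contents are made up, and the parser only handles single-line fields.

```python
def parse_stanzas(text):
    """Yield one dict of single-line fields per blank-line-separated stanza."""
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            key, sep, value = line.partition(":")
            if sep and not line.startswith((" ", "\t")):
                fields[key.strip()] = value.strip()
        if "Package" in fields:
            yield fields


def load_package_index(fetch):
    """Merge all listed package files; later files override earlier ones."""
    listing = fetch("Packages-list")
    # Old behaviour as the fallback when the server has no Packages-list.
    names = listing.split() if listing is not None else ["Packages.gz"]
    merged = {}
    for name in names:  # assumed to be listed oldest first
        text = fetch(name) or ""
        for fields in parse_stanzas(text):
            merged[fields["Package"]] = fields
    return merged


# A fake "server" held in a dict, with made-up contents.
files = {
    "Packages-list": "Packages.gz\nPackages.m04y02.gz\nPackages.d094y02.gz\n",
    "Packages.gz": "Package: example-tool\nVersion: 1.0\n\nPackage: example-lib\nVersion: 0.9\n",
    "Packages.m04y02.gz": "Package: example-tool\nVersion: 1.1\n",
    "Packages.d094y02.gz": "Package: example-tool\nVersion: 1.2\n",
}
merged = load_package_index(files.get)
print(merged["example-tool"]["Version"], merged["example-lib"]["Version"])  # 1.2 0.9
```

The redundant reads mentioned above are the two older `example-tool` stanzas, which get parsed and then thrown away.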

    This is like distributing source patches in addition to tarballs.

    I wouldn’t do this as an alternative to the sectional split, either. It would be done transparently in addition to per-section package lists. Also, if the Packages-list file (or whatever it’s called) isn’t present, apt could fall back on the old behavior.

    So, no need to keep old packages around, and no need to worry about dependency problems caused by multiple package list versions, since there would still only be one official version (the combination of all package files).

    Another possibility would be supplying the differential package lists as actual diffs. That might be more efficient if the only thing that changes in most packages is the version and depends.

    And yes, that anonymous poster was me. I wasn’t logged in due in part to DebianPlanet not letting me change my password.


    Subject: Re: Good start, but split on frequence too
    Author: noviota
    Date: Friday, 2002/04/05 – 02:38
    I actually fully agree with splitting them by group, not by frequency. A problem I can see occurring is dependencies, i.e. dev and libs would always have to be fetched. But perhaps X, net, mail, games, etc could save bandwidth, memory and also time. It would be great not to have the games and X sections listed on my firewall or intranet web server.

    Subject: Re: Good start, but split on frequence too
    Author: Anonymous
    Date: Friday, 2002/04/05 – 02:52
    But then if you wanted to install one game, you’d need to add another package file. What would be really nice is to be able to update the packages based on section, e.g. ‘apt-get update base’
    Debian Planet is not officially related to the Debian Project.
    Debian and the Debian logo are trademarks of Software in the Public Interest, Inc.