Written by ravirdv the 28 Feb 08 at 14:31.
Category: System.
Related project:
Nothing/Others.
Status: New
Rationale
Summary:
Ability to download only changed bits of files and use much less bandwidth.
Scope and Use Cases:
Ann has a slow internet connection. She sees that there are 150MB of updates and decides not to update at all, leaving her with a vulnerable and buggy system.
Implementation Plan:
Adopt it from Debian?
I think one of the Debian guys was working on this system a few years back. It would be a fantastic addition. I think most of the other OS players use something similar.
It's not just download time; a lot of people around the globe still have a download limit. Currently a lot of South African providers have a limit of between 2GB and 6GB. If more than 10% of your download allowance goes to updates, you really think about skipping them.
Sure, and it would be very good to have an intuitive and easy way to update offline. Sometimes you don't have Internet at all, and you cannot even update the package database...
Here is what I have in mind: Synaptic generates a Python script. Running the script on another machine with Internet access (Windows with OpenOffice and its built-in Python, or another Linux machine with Python installed) would download all the needed packages into a single directory. Then you just point Synaptic at this directory and it installs the packages and also updates the package database information.
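A rough sketch of what such a generated script might look like, purely as an illustration. The URL list, the `packages` directory name and both function names are made up here; nothing like this is produced by Synaptic today, and a real script would also need to fetch the package index files.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Hypothetical list that the package manager would write into the script.
PACKAGE_URLS = [
    "http://archive.example.org/pool/main/f/foo/foo_1.0-1_i386.deb",
    "http://archive.example.org/pool/main/b/bar/bar_2.3-4_all.deb",
]


def target_path(url, dest_dir):
    """Return the local path a package URL would be saved to."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(dest_dir, name)


def download_all(urls, dest_dir, fetch=urllib.request.urlretrieve):
    """Download every URL into dest_dir, skipping files already present."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = target_path(url, dest_dir)
        if not os.path.exists(path):
            fetch(url, path)
        saved.append(path)
    return saved


if __name__ == "__main__":
    for path in download_all(PACKAGE_URLS, "packages"):
        print("saved", path)
```

The `fetch` parameter is only there so the download step can be swapped out, for example to try the script without a network connection.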
I think this is really useful for everyone, and I also think that there should be an option to use P2P technology (like Kademlia or something else) to get those delta updates.
Perhaps rather than a diff between version X and version Y this would be a better use for Par2 which can use any repair block to fill in any other block of changed data. (And par3 whenever it comes out)
If disk space on the server isn't too much an issue it could work like so:
1) Make a 100% par2 set on the patch server. (This would be dramatically faster with par3, around 100x. par2 also has a limit of 32767 blocks, where par3's limit would be much higher. Given that the patches I see tend to top out at hundreds of MB, block size probably wouldn't be a concern.)
2) Retrieve .par2 from patch server
3) Run .par2 against a script-given directory (directories would need to be implemented)
4) Request as many blocks as are needed to repair all blocks of 'damage' (this process of repair would be faster with par3 in some situations)
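To make step 4 a bit more concrete, here is a minimal sketch of just the counting part: working out which fixed-size blocks of the installed file no longer match the server's per-block checksums. It does not do any par2 parity maths or repair. The function names, the use of MD5 and the assumption that blocks stay aligned (no inserted bytes shifting everything) are simplifications for illustration only.

```python
import hashlib


def block_hashes(data, block_size):
    """Checksum each fixed-size block of data (the last block may be shorter)."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_request(local_data, server_hashes, block_size):
    """Return indexes of server blocks that are missing or different locally."""
    local = block_hashes(local_data, block_size)
    return [
        i for i, wanted in enumerate(server_hashes)
        if i >= len(local) or local[i] != wanted
    ]


if __name__ == "__main__":
    old = b"a" * 16
    new = bytearray(old)
    new[9] = ord("b")  # one changed byte, which falls in block 2
    server = block_hashes(bytes(new), 4)
    print(blocks_to_request(old, server, 4))  # [2]
```

The number of indexes returned is the number of repair blocks the client would have to ask for under these assumptions.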
As long as the tool being patched isn't intolerant of being reset completely to default, any version could be reverted to any other version currently stored on the patch server. I imagine it wouldn't be terribly hard to script something to handle the reverse direction if it were necessary to revert something everywhere at once in userland.
It's worth noting that tampering with the par2 data breaks the math function and will make the repair fail. I suppose one could alter the par2 data entirely, but it would require re-paring the entire set, as no one repair block has any direct bearing on a particular section of data. Though it's not exactly hard to find a hash that would work if the security folks want to be paranoid. (Which they probably would, to detect tampering before the parity process. heh)
I'm no OS coder but that's my two cents on the matter of improved choice of patch algorithm.
If implemented, it will be a great boost for Ubuntu's success in many countries around the world.
Russia is one example: we still have very limited access to the Internet across the country. It is slow and expensive.
Delta updates will save money, time and server resources.
I was testing...
Unpacking the previous and updated debs and running bsdiff on the files shows differences of about 1% of the package size, or less for larger ones!!!
One of the previous minor updates of OOo (~100 MiB or so) turns into 10-20 KiB of binary diff. That could be a fantastic saving...
Hrmm... the idea is excellent. Many (most?) Linux users come from India and other places where Internet speed and bandwidth are a big concern regarding apt updates.
BTW, Ubuntu has these vast resources, so why can't they use the Conary package manager used in Foresight Linux etc.? AFAIK Conary can update using only the changed portions of files etc. :)
One thing I will NOT tolerate is Windows-style annoyances like having to update a particular program 3 times in separate sessions, instead of being handed the most up-to-date version at once. (Bonus irritation points for forcing a reboot between these 'old' patches, too)
Geez, I actually told people that this was a benefit of Ubuntu, thinking it was already implemented. I thought APT/.deb would read a patch file between its current install and the new download and then only download those lines.
Ex: OpenOffice doesn't re-download its 100MB install, rather only the changed code.
In addition to the cool factor, there is a strong business case to implement delta updates.
Firstly, one of Linux's best growth opportunities has been in 3rd world countries, many of which have much less access to the Internet, or where access is more costly. Automatic delta-based updates would drive adoption in these areas.
I already saw South Africa and Russia mentioned in the comments above.
Secondly and most obviously, it cuts down on bandwidth at Canonical.
Thirdly, it cuts down on bandwidth chez moi.
Fourthly, since delta-based updates require end users to store files on their side, it would become more practical for Canonical or end users to distribute updates over P2P.
You could imagine other scenarios too, such as bandwidth savings for corporate install servers.
I thought there were some interesting legal issues around some kinds of delta patching.
I remember an MMORPG project I was involved with a while back that decided to do it regardless because of the efficiency, but someone (Microsoft, I think) had the idea patented with the Windows Update jazz.
I'm not really sure if this is the case/has changed, but it's worth double checking. Software patents are a pain.
That's what I miss in Ubuntu too. Sometimes I can only use a dialup modem connection to get onto the net. Delta patches would be the best solution for me to keep my system up to date.
By the way: the fact that SuSE had this feature and Ubuntu didn't was the reason why I waited so long to switch to Ubuntu. But now I'm very happy with it...
The best idea on Brainstorm. It'll affect everybody by saving bandwidth, and will be life-changing for users from South America, Africa, and so on...
I can't even imagine a single reason to reject this idea.
Great idea and actually not that hard to implement.
I did some testing of my own. On packages like wine, where there is currently a lot of development, you only save about 50% of the bandwidth.
But on packages like the Linux kernel and libc updates, I was able to update from the original release to the current release in about 10 seconds for libc and 12 seconds for the kernel using a standard ADSL line.
On my system it takes longer to install the updates than to download them, since I'm at college and have a local high-speed mirror available. I think a lot of the files are just being overwritten with copies of themselves, however, so delta installation as well as delta downloading would be nice.
I like the idea of incremental updates. We have a cap on bandwidth, and installing a totally new version of, for instance, OpenOffice just for a minor update seems a bit of a waste.
Perfect for South Africa, where we have very restricted internet in terms of bandwidth. It is expensive. This is one of the features openSUSE has (delta RPMs) that I wish Ubuntu had.
This is an excellent idea, and fixes the core problem of many recent ideas, as well as lessening the impact of some problems (such as servers being slow during upgrade time)
People complaining that their internet is slow whilst downloading updates, for instance, would automatically be a lot happier if they only had to download 3MB instead of 50MB.
Personally I think there are 2 separate issues here.
1) updates for features that are not relevant
2) minor updates for features that are relevant (eg developer versions being tested)
In the case of (1), the problem could also be addressed by doing greater system profiling and only updating systems that will benefit. E.g. in the linked #11000 idea, the complaint is that "* Fix broadcom Makefile" produces a large download and the user doesn't have any Broadcom stuff. So if the package management system were cleverer, it would know not to download that upgrade because the system has no Broadcom hardware.
This is a substantially larger alteration than that needed for (2), namely the previously widely touted binary diff/patch.
I don't understand why this is still unimplemented?! Are we being ignored? It's on launchpad and ubuntu forum for like 2 years now?! I don't believe it's so hard to implement.
Even though I like this idea, I think that getting it to work will be tricky, since it would have to be changed upstream as well (as in, with Debian and other Debian-based distros). And getting that through will take time, patience, blood, sweat, a sore throat (interpret that however you like ;) and coding skills.
Someone designed a delta system for Gentoo that worked automatically. You'd use a specially set up server that mirrored the updates; it would work out the diff and send only the changes relative to the cached copy on your own machine (it only worked when you had a previously downloaded copy). It would also cache the diff.
With Ubuntu it's even easier, because updates get released on Gentoo faster than you can compile the last version.
A friend recently pointed out to me that there are pretty much only 2 kinds of people: those that update their computers regularly/daily, and those who never update.
Patches would be great for those who update regularly, but full package updates can be used if the user is more than 1-3 version updates behind.
Four consecutive patches would be less bandwidth-efficient than one full package.
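The rule of thumb above could be turned into a simple size comparison instead of a fixed version count. This is only a sketch: the function name and the sizes in the example are made up, and a real client would get the sizes from the repository metadata.

```python
def choose_download(delta_sizes, full_size):
    """Pick the cheaper option: the whole chain of deltas or the full package.

    delta_sizes lists the size of each patch needed to get from the
    installed version to the newest one; an empty list means no chain
    of patches is available.
    """
    total = sum(delta_sizes)
    if delta_sizes and total < full_size:
        return ("deltas", total)
    return ("full", full_size)


if __name__ == "__main__":
    # Hypothetical sizes in kB.
    print(choose_download([300, 300], 1000))   # ('deltas', 600)
    print(choose_download([300] * 4, 1000))    # ('full', 1000)
```

Either way the user ends up on the newest version in one session, rather than being walked through each old patch separately.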
$ sudo apt-get install gnome-backgrounds
The following NEW packages will be installed:
gnome-backgrounds
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 9782kB of archives.
.........................................
Setting up gnome-backgrounds (2.24.0-0ubuntu1) ...
Besides, the packages are not binaries; they are a collection of many binary files. In most upgrades, many of these files are not changed, so even a diff that includes only the changed files would be an improvement.
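As a rough illustration of the per-file approach, the sketch below compares two already unpacked package trees (for example, extracted with `dpkg-deb -x`) and lists which files a file-level update would need to ship. The function names are made up, symlinks are simply skipped, and each file is read fully into memory, so treat it as a sketch rather than a tool.

```python
import hashlib
import os


def file_digests(root):
    """Map each regular file under root (by relative path) to its SHA-256."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are ignored in this sketch
            with open(path, "rb") as fh:
                digests[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def changed_files(old_root, new_root):
    """Return (files that are new or modified, files that were removed)."""
    old = file_digests(old_root)
    new = file_digests(new_root)
    changed = sorted(p for p in new if old.get(p) != new[p])
    removed = sorted(p for p in old if p not in new)
    return changed, removed
```

Only the files in the first list would need to be downloaded; everything else is already on disk.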
I've got about 900 megs of games installed from Synaptic. I'd love the updates to be patch-based so I'm not downloading 600 megs of duplicate fluff every release cycle!!
And just think how much easier it would be on the mirrors.
Hey, I know! We should create ANOTHER package distribution format for distributing delta updates. Having two competing formats just isn't enough. Linux is about choice, right?
I'm using both Fedora and Kubuntu and have just updated my Fedora using the presto yum plugin. It downloaded 1.9 MiB of deltas instead of 49 MiB of full updates, which is a 96% bandwidth saving. This was the biggest saving I've ever encountered - usually it's about 80-85%.
It would be great if I could update my Kubuntu in the same way I upgraded my Fedora, especially because my Kubuntu is behind a slower connection to the Internet with a monthly bandwidth limit of 1 GiB.
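For anyone who wants to check the Fedora figure or work out their own, the saving is just one minus the ratio of delta size to full size. The function name is made up for this example.

```python
def saving_percent(delta_size, full_size):
    """Bandwidth saved by downloading the delta instead of the full update."""
    return (1 - delta_size / full_size) * 100


if __name__ == "__main__":
    # 1.9 MiB of deltas instead of 49 MiB of full updates.
    print(round(saving_percent(1.9, 49)))  # 96
```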