Images, apt-get clean and friends
wookey at wookware.org
Wed Aug 11 17:32:24 BST 2010
+++ Christian Robottom Reis [2010-08-05 22:28 -0300]:
> Hi there!
> I unpacked our minimal release image and ran an xdiskusage on it,
> mostly to see what we're shipping -- and I was surprised to see that a
> fourth of the image is actually apt package caches and lists.
This is typical for a debian-based minimal system.
Emdebian has spent some time developing tools for minimising the installed
size of Debian-compatible images, so I can make a few relevant comments.
> Can we
> put into the image generation script something to strip them out before
> generating the image?
We could. The tradeoff is having to download them again on first use
of apt on the target system, against a smaller installed system until
then. If apt is never run on the target, it's a big win.
Making sure that only repos that are actually needed on the target are
listed can help. Does it need src repos? Does it need
universe/multiverse? Leaving those out makes a huge difference.
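
As a rough illustration of trimming the repo list, here is a minimal Python
sketch. The function name trim_sources and its parameters are made up for
this example, and it assumes the plain one-line sources.list format
(type, URI, suite, components) with no bracketed options:

```python
def trim_sources(text, drop_components=("universe", "multiverse"), drop_src=True):
    """Return sources.list text with unwanted entries removed (hypothetical helper)."""
    kept = []
    for line in text.splitlines():
        fields = line.split()
        # Pass comments and blank lines through untouched.
        if not fields or fields[0].startswith("#"):
            kept.append(line)
            continue
        # Source repos are rarely needed on a target device.
        if drop_src and fields[0] == "deb-src":
            continue
        # Assumed layout: type, URI, suite, then zero or more components.
        head, components = fields[:3], fields[3:]
        components = [c for c in components if c not in drop_components]
        if len(fields) > 3 and not components:
            continue  # every component was dropped, so drop the entry
        kept.append(" ".join(head + components))
    return "\n".join(kept) + "\n"
```

Which components a given target really needs is a policy decision; the
defaults above just mirror the questions asked in this mail.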
I assume there are no .debs in the apt cache? debootstrap-based
installers leave all the .debs in because they are needed for
second-stage configuration, but I assume we've done the
second-staging by some means or other. (multistrap-based image
creation does not need the .debs for 'second-stage', so this issue
does not arise).
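
For what it's worth, the stripping step itself could look something like
the sketch below, run against the unpacked root filesystem before the
image is generated. This is only an assumption about how the image script
might do it; clean_apt is a made-up name, and the two directories are the
usual apt locations (var/lib/apt/lists and var/cache/apt/archives):

```python
import os

def clean_apt(rootfs):
    """Delete apt lists and cached .debs under an unpacked rootfs.

    Returns the number of bytes freed. Lock files and the 'partial'
    directories are left in place so apt still finds its layout.
    """
    freed = 0
    targets = [
        ("var/lib/apt/lists", lambda name: name != "lock"),
        ("var/cache/apt/archives", lambda name: name.endswith(".deb")),
    ]
    for rel, wanted in targets:
        top = os.path.join(rootfs, rel)
        # os.walk yields nothing if the directory does not exist.
        for dirpath, _dirs, files in os.walk(top):
            for name in files:
                if wanted(name):
                    path = os.path.join(dirpath, name)
                    freed += os.lstat(path).st_size
                    os.remove(path)
    return freed
```

The lists come back on the first 'apt-get update' on the target, which is
exactly the tradeoff described above.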
> The untarring also suggests a number of places where we could further
> trim the image, some of which are probably pretty hard to do:
> * stripping /usr/share/doc out (but everybody knew that)
> * dropping charmaps, zones and locale info that will never really be
> * stripping out modules for devices that won't ever be on
> this ARM device
> * stripping out firmware for peripherals that won't ever be on this
> ARM device
This is pretty close to what emdebian grip does - i.e. the set of easy
wins which approximately halves your base image size without making
any binary-incompatible changes or rebuilding anything. (although
emdebian doesn't do anything about kernels - we've left that as
We could use the em-grip tool (or a variant) to repackage our debs to
make smaller images. However the result is not policy-compliant
'ubuntu', but a new repository of packages containing the exact same
binaries, but less bloat, on top of which you can install any normal
ubuntu packages which have not had this treatment. That may or may not
be how we want to proceed? It is a sane and effective way to manage
this sort of thing (it is currently trivial to crossgrade Debian to
emdebian-grip and save a load of space, or to use the installer to
install grip instead of normal Debian). We could pull the same trick
for Ubuntu with relatively little effort.
Grip does the following things to compatibly save space:
* Reduce all Long descriptions to 4 lines in packages files (makes them
approx half the size)
* strip other fields that aren't actually needed (including 'recommends')
* strip all docs, examples, manpages, just leaving copyright files
* sets dpkg-vendor so that it can be used to do different stuff in
maintainer scripts (or on rebuilds).
* restricts overall archive size to keep apt metadata size down
* remove lintian files, help files
* don't require everything 'essential' so a smaller minimal system can
* split translations out into '.tdebs' - one per lang per package, in
separate pool, i.e. not like the ubuntu or proposed Debian schemes
* tzdata is one thing we left alone in grip, although it would be
really good to slim it down a bit. In crush it was shrunk to ~1/3rd the
size (2.4MB) by removing the 'right' and 'posix' copies.
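
To make the doc-stripping item concrete, here is a small Python sketch of
the same idea applied to an unpacked tree. It is not how em-grip itself is
implemented (that tool repackages the .debs); strip_docs is a hypothetical
name, and the paths are the standard usr/share/doc and usr/share/man:

```python
import os
import shutil

def strip_docs(rootfs):
    """Remove docs, examples and manpages from a tree, keeping copyright files."""
    shutil.rmtree(os.path.join(rootfs, "usr/share/man"), ignore_errors=True)
    doc = os.path.join(rootfs, "usr/share/doc")
    # Walk bottom-up so directories emptied on the way can be removed.
    for dirpath, dirs, files in os.walk(doc, topdown=False):
        for name in files:
            if name != "copyright":
                os.remove(os.path.join(dirpath, name))
        for name in dirs:
            sub = os.path.join(dirpath, name)
            # Leave symlinked doc directories alone; they may point at
            # another package's copyright file.
            if not os.path.islink(sub) and not os.listdir(sub):
                os.rmdir(sub)
```

Doing this on the image rather than in the packages has the upgrade
caveat discussed further down: dpkg still believes the files are there.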
Of course another way to achieve much the same effect is to use dpkg
filtering at install time to do the same sorts of stripping. This
means you can leave the package files exactly as they were (and
downloads don't get any smaller, only final images). That has been
implemented as proof of concept a few years back, but there are some
complicated issues about what happens on future upgrades/removals and
exactly how dpkg should deal with operations on installed-but-filtered
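
The filtering decision itself is simple enough to sketch. The Python below
is purely illustrative and is not the proof-of-concept dpkg code; the rule
format, the RULES list and the should_install name are all assumptions made
for this example. Rules are applied in order and the last one that matches
a path decides:

```python
from fnmatch import fnmatchcase

def should_install(path, rules):
    """Decide whether a packaged file is unpacked (hypothetical filter).

    rules is an ordered list of ("exclude" | "include", glob) pairs;
    the last rule matching the path wins, and unmatched paths are kept.
    """
    keep = True
    for action, pattern in rules:
        # Note: fnmatch-style '*' also matches '/' characters.
        if fnmatchcase(path, pattern):
            keep = (action == "include")
    return keep

# Example rule set: drop docs and manpages, but keep copyright files.
RULES = [
    ("exclude", "/usr/share/doc/*"),
    ("include", "/usr/share/doc/*/copyright"),
    ("exclude", "/usr/share/man/*"),
]
```

The hard part is not this test but the bookkeeping afterwards: the package
database has to cope with files that are listed as installed yet were
never written to disk.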
If we want to make smaller images we should certainly look at re-using
some of the emdebian technology and/or mechanisms as it already works
Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM