Re: Smaller binary size

Subject: Re: Smaller binary size
From: nml <arumakanil@xxxxxxxxx>
Reply-to: myrddin-dev@xxxxxxxxxxxxxx
Date: Fri, 1 Dec 2017 12:02:55 +0800
To: myrddin-dev@xxxxxxxxxxxxxx
On 1 December 2017 at 11:28, Andrew Chambers <andrewchamberss@xxxxxxxxx>
wrote:

> You don't need to split the code into smaller packages for smaller linking,
> just into different .o files. Musl libc does this; the init calls are a
> different story though.
>
Hi Andrew,

We can safely remove all the init calls associated with a package if the
package can be proved to be unused.
But within a package, we have to keep all init calls (and all the symbols
they reference downstream) even if the APIs involved are not used at all.
For example, hello-world doesn't need a DNS resolver, but the resolver has
to be kept because we cannot guarantee that the init calls have no side
effects that some APIs depend on.
That's why I started thinking about a fine-grained std in the first place.
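
To make the problem concrete, here is a minimal sketch in C rather than
Myrddin. It assumes GCC or Clang, since __attribute__((constructor)) is a
compiler extension, and every name in it (pkg_init, resolver_setup,
lookup_server) is a hypothetical stand-in, not anything from libstd. The
constructor plays the role of a package-level init call:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resolver state that the init call fills in. */
static const char *nameserver = NULL;

/* Hypothetical setup; a real resolver might parse /etc/resolv.conf. */
static void resolver_setup(void)
{
	nameserver = "127.0.0.1";
}

/* GCC/Clang extension: runs before main(), like a package-level init.
 * The startup path references it, so the linker has to keep it and
 * everything it calls, whether or not lookup_server() is ever used. */
__attribute__((constructor))
static void pkg_init(void)
{
	resolver_setup();
}

/* An API that silently depends on the side effect of pkg_init(). */
const char *lookup_server(void)
{
	assert(nameserver != NULL);
	return nameserver;
}
```

A program that never calls lookup_server() still carries resolver_setup(),
because nothing tells the linker that the side effect is unobservable.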

The lazy init approach that ori mentioned earlier is a bit ad hoc for my
taste, if I understand it correctly (that is, adding the init to the entry
point of each API that depends on it).
It's suitable for some scenarios but not all.
I am still contemplating a win-win solution that does not require breaking
std into pieces.



> On Fri, Dec 1, 2017 at 4:17 PM, nml <arumakanil@xxxxxxxxx> wrote:
>
>> Thanks for your comprehensive response!
>>
>> Regarding the big-ish design of std, I think a more focused package
>> organization has merits beyond easing link-time dead code GC.
>> For one thing, the 'use' statements would be spread among many files
>> when a package is more complex, so any one file is less likely to be
>> forced into a long 'use' list.
>> Secondly, a more fine-grained use list gives the programmer an overview
>> of what a particular file is about. (Is this file for CLI arg parsing?
>> Does it involve any IO or networking? Does it touch the file system?)
>> Thirdly, it simplifies the naming of APIs. For example, 'std.htput' can
>> be 'ht.put'.
>>
>> I am biased here since I am used to the auto-completion of imports
>> provided by the vim-go plugin, so I have yet to experience the
>> inconvenience of a long use/import list.
>> For C, yeah, it's kind of annoying, and I have seen people just resort to
>> copying the entire set of #includes from other files of the same program.
>>
>> Maybe forbidding unused imports and an import-auto-completion tool would
>> be good for Myrlang too?
>>
>>
>>
>>> On 1 December 2017 at 05:28, Ori Bernstein <ori@xxxxxxxxxxxxxx> wrote:
>>>
>>>> On Thu, 30 Nov 2017 21:57:51 +0800
>>>> nml <arumakanil@xxxxxxxxx> wrote:
>>>>
>>>> > Hello everybody,
>>>> >
>>>> > I found that there is some unnecessary code included in the compiled
>>>> > executable. A hello-world program contains the code for a DNS
>>>> > resolver, for instance. It's not a big deal in most real-life
>>>> > applications, but I'd like to investigate the possibility of
>>>> > eliminating the bloat.
>>>>
>>>> A bit of profiling goes a long way. I'd suggest starting with bloaty
>>>> before taking a shot in the dark:
>>>>
>>>> https://github.com/google/bloaty
>>>>
>>>> > As far as I can tell the symbols are included because they are called
>>>> > by the __init__ functions of the std package. Do you have any ideas
>>>> > about what I can do?
>>>> >
>>>> > What I have come up with so far is to extract some parts of the std
>>>> > package into separate packages. Thoughts?
>>>>
>>>> A hello world binary is a bit fat right now, but I'm not too worried if
>>>> it's mostly constant overhead. At least, it's not a case worth making
>>>> the libraries harder to use.
>>>>
>>>> That said, making it smaller would be good. For specific inits that
>>>> drag in a lot, I think we can move some initialization to a lazy
>>>> initialization when we reach any entry point to that code:
>>>>
>>>>         const lazyinit = {
>>>>                 if !initdone    /* skip locking in common case */
>>>>                         lock(netlck)
>>>>                         if !initdone
>>>>                                 init()
>>>>                         ;;
>>>>                         unlock(netlck)
>>>>                 ;;
>>>>         }
>>>>
>>>> Which might also help startup time, and make it saner to reinit things
>>>> on, for example, the user editing /etc/hosts.
>>>>
>>>> I think some cleanup of the standard libraries would also be welcome,
>>>> although libstd is a bit big by design; I think I'd rather have
>>>>
>>>>         use std
>>>>
>>>> over 9001 'use microlibrary' statements. The hope is that some care,
>>>> thought, and tasteful use of traits would allow us to make our APIs
>>>> deeper, delete code, and reduce the number of functions needed to
>>>> accomplish things.
>>>>
>>>> > I would also like to know if there is any near-term roadmap or
>>>> > milestones, or what you guys are working on currently.
>>>>
>>>> There's no grand roadmap, just a bunch of people doing what amuses
>>>> them. I don't have the desire or authority to tell people what they
>>>> should be working on.
>>>>
>>>> A few things that I'm poking at, though: Some of these actually have
>>>> significant code behind them, others I've just thought about:
>>>>
>>>>         - Actually using Myrddin to write more code.
>>>>                 There's irc.myr, contbuild, but I'd like to actually
>>>>                 write more tools. There are a bunch of half-finished
>>>>                 things which I need to find some time to finish up.
>>>>
>>>>         - Implementing an assembler and linker.
>>>>                 Cross linking for different architectures painlessly
>>>>                 would be nice. As would not depending on the GNU
>>>>                 toolchain. I'm collaborating on this one with the 'scc'
>>>>                 (Something C compiler) people.
>>>>
>>>>         - Switching to QBE (http://c9x.me/compile).
>>>>                 QBE is a nice compiler backend, and I've got it
>>>>                 partially working. It needs to become fully working, get
>>>>                 ported to Plan 9, and grow at least line number debug
>>>>                 info support before it can become the default backend.
>>>>
>>>>                 This should also help with code size, since right now we
>>>>                 just generate tons of spare instructions.
>>>>
>>>>         - Autogenerating C bindings.
>>>>                 QC is a very good starting point, but there's a lot to
>>>>                 do in order to make C bindings play well with namespaces.
>>>>
>>>>         - Working on GUI support (X11 + devdraw bindings)
>>>>                 Drawing on the screen is sometimes useful. Doing full
>>>>                 GUI support on Unix will probably involve figuring out
>>>>                 fonts, although as a stopgap I can probably use the
>>>>                 Plan 9 subfont format with X11.
>>>>
>>>>         - Making libthread not suck.
>>>>                 It particularly sucks on OSX, but it's a little bit bare
>>>>                 even on other systems. Figuring out some higher level
>>>>                 APIs would be a good idea.
>>>>
>>>>         - Expanding libcrypto.
>>>>                 It's missing a large number of virtually essential
>>>>                 algorithms.
>>>>
>>>>         - And using libcrypto to create libtls.
>>>>                 Essential for what? TLS, which is used for a lot of
>>>>                 communication these days.
>>>>
>>>>         - Implementing more libraries:
>>>>                 - Libmath:      Basic floating point math;
>>>>                                 transcendentals, etc.
>>>>                 - Libflate:     Inflate/deflate/bzip2/... compression.
>>>>
>>>>         - Bootstrapping:
>>>>                 This is low priority, but it will happen at some point.
>>>>                 There's a parser library that needs to learn about type
>>>>                 inference.
>>>>
>>>> --
>>>> Ori Bernstein <ori@xxxxxxxxxxxxxx>
>>>>
>>>
>>>
>>
>