Re: Smaller binary size
- Subject: Re: Smaller binary size
- From: nml <arumakanil@xxxxxxxxx>
- Reply-to: myrddin-dev@xxxxxxxxxxxxxx
- Date: Fri, 1 Dec 2017 12:02:55 +0800
- To: myrddin-dev@xxxxxxxxxxxxxx
On 1 December 2017 at 11:28, Andrew Chambers <andrewchamberss@xxxxxxxxx>
wrote:
> You don't need to split the code into smaller packages for smaller linking,
> just into different .o files. Musl libc does this; the init calls are a
> different story though.
>
Hi Andrew,
We can safely remove all the init calls associated with a package if the
package can be proved to be unused.
But within a package, we have to keep all init calls (and all the symbols
they reference downstream) even if the APIs involved are not used at all.
For example, hello-world doesn't need a DNS resolver, but the resolver has to
be kept because we cannot guarantee that the init calls have no side effects
that some APIs may depend on.
That's why I first started thinking about a more fine-grained std.
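
To make the problem concrete, here is a rough sketch in C rather than
Myrddin. All names are hypothetical, and the constructor attribute is a
GCC/Clang extension standing in for a package __init__; it is only meant to
show how an init call pins code that hello-world never asks for:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical package state, set up by the package's init call. */
int resolver_ready = 0;

/* Stand-in for the DNS resolver setup that hello-world never asks for. */
void resolver_setup(void)
{
    resolver_ready = 1;
}

/*
 * GCC/Clang extension: runs before main(), much like a package __init__.
 * Because it references resolver_setup(), that function and everything it
 * references stay reachable, so the linker cannot discard them.
 */
__attribute__((constructor))
void pkg_init(void)
{
    resolver_setup();
}

/* An API that silently relies on the side effect of pkg_init(). */
int resolve_stub(const char *name)
{
    assert(resolver_ready);
    return name != NULL;
}

/* An API that needs none of the above, yet still pays for it. */
int say_hello(void)
{
    return puts("hello") >= 0;
}
```

A program that only calls say_hello() still carries resolver_setup(),
because nothing proves that dropping pkg_init() is safe for the rest of the
package.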
The lazy init approach that ori mentioned earlier is a bit ad hoc for my
taste, if I understand it correctly (that is, adding the init check to the
entry point of each API that depends on that init).
It's suitable for some scenarios but not all.
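
As I understand the lazy approach, it would look roughly like the following
C sketch. The names are made up, and I use pthread_once in place of the
explicit lock and flag from ori's snippet, so treat it as an illustration
under those assumptions rather than what std would actually do:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_once_t net_once = PTHREAD_ONCE_INIT;

/* Counts how often the (hypothetical) expensive init really runs. */
int net_init_calls = 0;

/* Stand-in for the work a package init would otherwise do at startup. */
static void net_init(void)
{
    net_init_calls++;
}

/* Every entry point that depends on the init has to start with this. */
static void net_lazyinit(void)
{
    pthread_once(&net_once, net_init);
}

/* Hypothetical API entry points; the bodies are placeholders. */
int net_resolve(const char *host)
{
    net_lazyinit();
    assert(host != NULL);
    return 1;
}

int net_dial(const char *addr)
{
    net_lazyinit();
    assert(addr != NULL);
    return 1;
}
```

The init runs at most once and only if some net_* function is called, but
every new entry point has to remember the net_lazyinit() call, which is the
part I find ad hoc. Depending on the toolchain this may need -pthread to
build.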
I am still contemplating a win-win solution that does not need to break
std into pieces.
> On Fri, Dec 1, 2017 at 4:17 PM, nml <arumakanil@xxxxxxxxx> wrote:
>
>> Thanks for your comprehensive response!
>>
>> Regarding the big-ish design of std, I think more focused package
>> organization has merits as well besides easing the link-time dead code GC.
>> For one thing, the 'use' statements would be spread among many files when
>> the package is more complex. So it's less likely to be forced into a long
>> 'use' list most of the time.
>> Secondly, the more fine-grained use list gives the programmer an overview
>> of what the particular file is about. (Is this file for cli arg parsing?
>> does it involve any IO or networking? does it touch the file system?)
>> Thirdly, it simplifies the naming of APIs. For example, 'std.htput' can
>> be 'ht.put'.
>>
>> I am biased here since I am used to the auto-completion of imports
>> provided by the vim-go plugin. So I have yet to experience the
>> inconvenience of a long use/import list.
>> For C, yeah, it's kind of annoying and I have seen people just resort to
>> copying the entire #includes from the other files of the same program.
>>
>> Maybe forbidding unused imports and an import-auto-completion tool would
>> be good for Myrlang too?
>>
>>
>>
>> On 1 December 2017 at 11:10, nml <arumakanil@xxxxxxxxx> wrote:
>>
>>> On 1 December 2017 at 05:28, Ori Bernstein <ori@xxxxxxxxxxxxxx> wrote:
>>>
>>>> On Thu, 30 Nov 2017 21:57:51 +0800
>>>> nml <arumakanil@xxxxxxxxx> wrote:
>>>>
>>>> > Hello everybody,
>>>> >
>>>> > I found that there is some unnecessary code included in the compiled
>>>> > executable. A hello-world program contains the code for a DNS
>>>> > resolver, for instance. It's not a big deal in most real-life
>>>> > applications but I'd like to investigate the possibility of being
>>>> > bloatedness-free.
>>>>
>>>> A bit of profiling goes a long way. I'd suggest starting with bloaty
>>>> before taking a shot in the dark:
>>>>
>>>> https://github.com/google/bloaty
>>>>
>>>> > As far as I can tell the symbols are included because they are called
>>>> > by the __init__ functions of the std package. Do you have any ideas
>>>> > about what I can do?
>>>> >
>>>> > What I have come up with so far is to extract some parts of the std
>>>> > package into separate packages. Thoughts?
>>>>
>>>> A hello world binary is a bit fat right now, but I'm not too worried if
>>>> it's mostly constant overhead. At least, it's not a case worth making
>>>> the libraries harder to use.
>>>>
>>>> That said, making it smaller would be good. For specific inits that
>>>> drag in a lot, I think we can move some initialization to a lazy
>>>> initialization when we reach any entry point to that code:
>>>>
>>>>     const lazyinit = {
>>>>         if !initdone /* skip locking in common case */
>>>>             lock(netlck)
>>>>             if !initdone
>>>>                 init()
>>>>             ;;
>>>>             unlock(netlck)
>>>>         ;;
>>>>     }
>>>>
>>>> Which might also help startup time, and make it saner to reinit things
>>>> on, for example, the user editing /etc/hosts.
>>>>
>>>> I think some cleanup of the standard libraries would also be welcome,
>>>> although libstd is a bit big by design; I think I'd rather have
>>>>
>>>> use std
>>>>
>>>> over 9001 'use microlibrary' statements. The hope is that some care,
>>>> thought, and tasteful use of traits would allow us to make our APIs
>>>> deeper, delete code, and reduce the number of functions needed to
>>>> accomplish things.
>>>>
>>>> > I would also like to know if there is any near-term roadmap or
>>>> > milestones or what you guys are working on currently.
>>>>
>>>> There's no grand roadmap, just a bunch of people doing what amuses
>>>> them. I don't have the desire or authority to tell people what they
>>>> should be working on.
>>>>
>>>> A few things that I'm poking at, though: Some of these actually have
>>>> significant code behind them, others I've just thought about:
>>>>
>>>> - Actually using Myrddin to write more code.
>>>>     There's irc.myr, contbuild, but I'd like to actually write
>>>>     more tools. There are a bunch of half-finished things which
>>>>     I need to find some time to finish up.
>>>>
>>>> - Implementing an assembler and linker.
>>>>     Cross linking for different architectures painlessly would be
>>>>     nice. As would not depending on the GNU toolchain. I'm
>>>>     collaborating on this one with the 'scc' (Something C
>>>>     compiler) people.
>>>>
>>>> - Switching to QBE (http://c9x.me/compile).
>>>>     QBE is a nice compiler backend, and I've got it partially
>>>>     working. It needs to become fully working, get ported to
>>>>     Plan 9, and grow at least line number debug info support
>>>>     before it can become the default backend.
>>>>
>>>>     This should also help with code size, since right now we
>>>>     just generate tons of spare instructions.
>>>>
>>>> - Autogenerating C bindings.
>>>>     QC is a very good starting point, but there's a lot to do
>>>>     in order to make C bindings play well with namespaces.
>>>>
>>>> - Working on GUI support (X11 + devdraw bindings)
>>>>     Drawing on the screen is sometimes useful. Doing full
>>>>     GUI support on Unix will probably involve figuring out
>>>>     fonts, although as a stopgap I can probably use the
>>>>     Plan 9 subfont format with X11.
>>>>
>>>> - Making libthread not suck.
>>>>     It particularly sucks on OSX, but it's a little bit bare
>>>>     even on other systems. Figuring out some higher level
>>>>     APIs would be a good idea.
>>>>
>>>> - Expanding libcrypto.
>>>>     It's missing a large number of virtually essential
>>>>     algorithms.
>>>>
>>>> - And using libcrypto to create libtls.
>>>>     Essential for what? TLS, which is used for a lot of
>>>>     communication these days.
>>>>
>>>> - Implementing more libraries:
>>>>     - Libmath: Basic floating point math; transcendentals, etc.
>>>>     - Libflate: Inflate/deflate/bzip2/... compression.
>>>>
>>>> - Bootstrapping:
>>>>     This is low priority, but it will happen at some point.
>>>>     There's a parser library that needs to learn about type
>>>>     inference.
>>>>
>>>> --
>>>> Ori Bernstein <ori@xxxxxxxxxxxxxx>
>>>>
>>>
>>>
>>
>