Hello World!
News on the GSOC project! I've been able to get "Hello world!" working!
Here are my notes on the process of getting it to this point.
Figuring Out the Right Approach
On Sunday, I had a discussion on #moarvm with mornfall, nine, and brrt about the path forward I suggested in last week's update. Mornfall pointed out that replacing _start would likely be way more of a headache than it was worth. They instead suggested using objcopy to convert a text file containing the MBC bytecode into a .o file (also called a relocatable ELF file), and then writing a small C program which invokes MoarVM using the bytecode packaged in the .o file.
This is essentially the approach I used for my first attempt, with a few tweaks. As pointed out by nine, it currently isn't possible to directly call moar and have it execute MBC. If you try, you'll be greeted with this error message:
Nine mentioned that there is a good deal of necessary setup that is done when perl6 is called, including things like identifying the library search and repository paths to all the needed components. That is why standard users should call perl6, not moar, and why calling moar to directly execute bytecode isn't currently an option.
This week, I wanted to see if it was even possible to embed a Perl 6 program in a homemade ELF file and successfully run that program out of the ELF file. So I decided to follow the outline of mornfall's suggestion, but to use a text file containing source code instead of MBC and to have my C program invoke perl6 instead of moar. I figured that once that basic proof of concept was finished, I would have made enough progress to be able to attempt nine's suggestion of isolating the initialization components of perl6's startup script, and I would have something to test with.
The First Attempt
This is probably the hackiest approach possible. Instead of attempting to write my own ELF file, I followed this answer on StackOverflow to use objcopy to create an appropriate .o file (relocatable ELF file). The same answer also provided a stub script in C for accessing the text contained in the ELF. The stub script needed to be changed a little to fit my needs. For starters, I had to change the way data_size was being computed since the value the script originally populated it with was the same as the pointer _binary_test_txt_end. I also changed it to call perl6 using the data as the argument instead of simply printing out the data char by char.
The resulting attempt2.c is fairly simple. It starts off by declaring two symbols, _binary_test_txt_start and _binary_test_txt_end, whose values are contained in the hello_world ELF. It then calls the user-provided perl6 with the text of the program contained in the ELF file.
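As a rough illustration of that second step, here is a minimal sketch of how the command string could be assembled. This is not the original attempt2.c: the helper name build_perl6_command and its parameters are hypothetical, and it takes the start and end pointers as arguments so it can be compiled without the .o file. The point it shows is that the buffer has to hold the perl6 path, the " -e '" wrapper, the embedded text, and the terminating NUL, and that the size of the embedded text is the difference between the two pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the original attempt2.c).
 * Builds:  <perl6_path> -e '<program text>'
 * from the byte range [start, end). Returns a malloc'd string, or NULL on
 * allocation failure. The caller frees the result.
 * Assumption: the program text contains no single quotes, since they would
 * end the shell quoting early. */
char *build_perl6_command(const char *perl6_path,
                          const char *start, const char *end)
{
    static const char prefix[] = " -e '";
    static const char suffix[] = "'";

    assert(end >= start);
    /* The size of the embedded text is the distance between the two
     * symbols, not the value of either pointer. */
    size_t data_size = (size_t)(end - start);

    /* Room for every piece plus the terminating NUL. */
    size_t total = strlen(perl6_path) + strlen(prefix)
                 + data_size + strlen(suffix) + 1;

    char *command = malloc(total);
    if (command == NULL) {
        return NULL;
    }

    /* %.*s copies at most data_size bytes; the embedded text is not
     * NUL-terminated inside the object file. */
    snprintf(command, total, "%s%s%.*s%s",
             perl6_path, prefix, (int)data_size, start, suffix);
    return command;
}
```

In a stub linked against the objcopy output, the two pointers would be declared as extern char _binary_test_txt_start[] and extern char _binary_test_txt_end[], and the returned string could be handed to system().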
Once those tweaks were made, I was able to get the output "Hello world!" by doing the following.
- Make a text file, test.txt, containing the desired Perl 6 program.
- Call objcopy.
- Link the resulting ELF file and the stub C script, attempt2.c, using GCC.
- Call the resulting a.out executable.
Those steps look like this when run:
The Second Attempt
With that working, I wanted to modify my earlier attempts at creating an executable ELF to make the needed relocatable ELF (.o file), rather than relying on objcopy. To figure out what I needed to change, I called readelf -a test.o. This showed that I needed to make a couple of minor changes to the header and to add and remove several sections and section headers. One of the sections that needed to be added was .symtab, which required a bit of learning on how to format the symbol table, but thankfully, I had a functional reference as well as this guide.
test.o's section header table:
test.o's symbol table:
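For readers who haven't met .symtab before, here is a minimal sketch of what its entries could look like for the two symbols the C stub needs. It assumes a 64-bit target and the Linux <elf.h> header. The helper names make_global_symbol and fill_blob_symbols are hypothetical and are not taken from main.c, and the section index and string-table offsets are placeholders the caller would supply.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills one .symtab entry for a global, untyped
 * symbol that lives at offset `value` inside section `section_index`. */
Elf64_Sym make_global_symbol(uint32_t name_offset, uint16_t section_index,
                             uint64_t value)
{
    Elf64_Sym sym;
    memset(&sym, 0, sizeof sym);
    sym.st_name  = name_offset;   /* byte offset of the name in .strtab */
    sym.st_info  = ELF64_ST_INFO(STB_GLOBAL, STT_NOTYPE);
    sym.st_other = STV_DEFAULT;
    sym.st_shndx = section_index; /* section holding the embedded text */
    sym.st_value = value;         /* offset within that section */
    sym.st_size  = 0;
    return sym;
}

/* Hypothetical helper: writes three entries into `out`.
 *   out[0]  the reserved all-zero null symbol
 *   out[1]  the start symbol, at offset 0 of the data section
 *   out[2]  the end symbol, at offset blob_size of the data section
 * Local symbols must come before global ones, and the .symtab section
 * header's sh_info holds the index of the first global, which is 1 here. */
void fill_blob_symbols(Elf64_Sym out[3], uint16_t data_index,
                       uint32_t start_name, uint32_t end_name,
                       uint64_t blob_size)
{
    assert(out != NULL);
    memset(&out[0], 0, sizeof out[0]);
    out[1] = make_global_symbol(start_name, data_index, 0);
    out[2] = make_global_symbol(end_name, data_index, blob_size);
}
```

The difference between the two st_value fields is the length of the embedded text, which is how the C stub can work out how many bytes to read.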
Once those changes to main.c had been made, I was able to get the output "Hello world!" by doing the following:
- Make a file containing the desired Perl 6 program. This can have any name, and can have any file extension valid for a file containing text (txt, pl, pl6...)
- Call ./linker <your_file>
- Link the resulting ELF file and the stub C script, attempt2.c, using GCC.
- Call the resulting a.out executable.
Those steps look like this when run:
So at this point, I am able to build a relocatable ELF file which has a Perl 6 program embedded in it which can be used to execute that embedded program!
The Path Forward
Now that I know it's possible to embed source code in a relocatable ELF file and execute that program, I can move on to taking bytecode for a program, embedding it in the ELF file, and executing that bytecode with a call to MoarVM. In order to accomplish that, there are a few steps forward that need to be made.
- The initialization part of perl6's setup script needs to be isolated so that it can be compiled down to MBC and tacked on at the beginning of whatever program is being made into an executable.
- I need to actually tack that bytecode onto the beginning of the program being made into an executable. (This should be pretty straightforward :P)
- I need to be able to get "Hello world!" as output when using a file containing bytecode.
- I want to see if I can reduce the build process from having to call ./linker to build the ELF and then gcc to link that ELF to the C program that allows me to run the contained code to a single call. This is going to require doing that linking myself, which may prove to be fairly complicated.
- I want to make executing it simpler. Ideally, all I would have to do is call hello_world or foo or whatever the name of the compiled program is, and have it run. This may also prove to be a bit complicated.
I also need to do some code cleanup and rename the files to more logical things (attempt2.c isn't exactly intuitive, and linker isn't totally accurate at the moment). This is not meant to be a polished thing at this stage, and is very much a work in progress.
With the rest of this week, I'm going to try to isolate the initialization section and do some code cleanup. If luck's with me, I'll hopefully have another post by the end of the week with an update on my progress on separating that part of the set-up script!
Madeline: Excellent progress for the week! Nice work!
Thanks for posting in detail about what you've thought and tried and discussed. The irc exchange was a delight to read. Your approach sounds spot on -- having a good time, continuing to try the simplest thing that might possibly work, making progress you can have fun blogging about, and oscillating between being outside your comfort zone and back in again to maximize learning.
(On the learning front I trust you are making sure to sleep well and long because, according to a brain science book my sister's a fan of, that's apparently the #1 thing for most effectively consolidating in your own brain and mind your long term constructive cognitive and emotional takeaway from a day's activity. :))
At line 20 you initialize data_size with the size of test.txt.
At line 22, the command variable points to a buffer of data_size chars, but at line 23, you set command to: "~/sandbox/perl6_2/bin/perl6" + " -e '" + test.txt + "'" + \0
So... you will need more space...
Whoops, thank you for catching that! I'll fix that now.