# mkg3a

Casio’s FX-CG, or Prizm, is a rather interesting device, and the programmers over on Cemetech seem to have found it worthwhile to make the Prizm do their bidding in software.

The Prizm device itself is based around some sort of SuperH core, identified at times in the system software as an SH7305, a “SH7780 or thereabouts”. The 7780 is not an exact match, though, and it’s likely a licensed SH4 core in a Casio ASIC. Whatever the case, GCC targeted for sh, compiling without the FPU (-m4a-nofpu) and in big-endian mode (-mb), seems to work on the hardware provided.

Between Jonimus and myself (with input from other users on what configurations will work), we’ve assembled a GCC-based toolchain targeting the Prizm. Jon put together a cross-compiler for sh with some supporting scripts, while I contributed a linker script and runtime initialization routine (crt0), both of which were adapted from Kristaba’s work.

With that, we can build binaries targeting sh and linked such that they’ll run on the Prizm, but that alone isn’t very useful. Jon also created libfxcg, a library providing access to the syscalls on the platform. Finally, I created mkg3a, a tool to pack the raw binaries output by the linker into the g3a files accepted by the device.

Rumor has it the whole set of tools works. I haven’t been able to verify that myself since I don’t have a Prizm of my own, but it’s all out there. Tarballs of the whole package are over on Jon’s site, for anyone interested.

# Pointless Linux Hacks

I nearly always find it interesting to muck about in someone else’s code, often to add simple features or to make it do something silly, and the Linux kernel is no exception to that. What follows is my own first adventure into patching Linux to do my evil bidding.

Aside from mucking about in code for fun, digging through public source code such as that provided by Linux can be very useful when developing something new.

## A short story

I was doing nothing of particular importance yesterday afternoon when I was booting up my previously mentioned netbook. The machine usually runs on a straight framebuffer powered by KMS on i915 hardware, and my kernel is configured to show the famous Tux logo while booting.

Readers familiar with the logo behaviour might already see where I’m going with this, but the kernel typically displays one copy of the logo for each processor in the system (so a uniprocessor machine shows one Tux, a quad-core shows four, etc.). As a bit of a joke, a friend suggested, why not patch my kernel to make it look like a much more powerful machine than it really is? Of course, that’s exactly what I did, and here’s the patch for Linux 2.6.38.

    --- drivers/video/fbmem.c.orig	2011-04-14 07:26:34.865849376 -0400
    +++ drivers/video/fbmem.c	2011-04-13 13:06:28.706011678 -0400
    @@ -635,7 +635,7 @@
     	int y;
     
     	y = fb_show_logo_line(info, rotate, fb_logo.logo, 0,
    -			      num_online_cpus());
    +			      4 * num_online_cpus());
     	y = fb_show_extra_logos(info, y, rotate);
     
     	return y;

Quite simply, my netbook now pretends to have an eight-core processor (the Atom with SMT reports two logical cores) as far as the visual indications go while booting up.

## Source-diving

Thus we come to source-diving, a term I’ve borrowed from the community of Nethack players to describe the process of searching for the location of a particular piece of code in some larger project.

Diving in someone else’s source is frequently useful, although I don’t have any specific examples of it in my own work at the moment. For an outside example, have a look at musca, a tiling window manager for X that used ratpoison and dwm (two other X window managers) as models:

Musca’s code is actually written from scratch, but a lot of useful stuff was gleaned from reading the source code of those two excellent projects.

A personal recommendation for anyone seeking to go source-diving: become good friends with grep. In the case of my patch above, the process went something like this:

• grep -R LOGO_LINUX linux-2.6.38/ to find all references to LOGO_LINUX in the source tree.
• Examine the related files, find drivers/video/fbmem.c, which contains the logo display code.
• Find the part which controls the number of logos to display by searching that file for ‘cpu’, assuming (correctly) that it must call some outside function to get the number of CPUs active in the system.
• Patch line 638 (for great justice).
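For anyone who would rather script the search than type grep by hand, here is a minimal Python sketch of the same idea. The function name `find_references` is my own invention for illustration, and it only does plain substring matching, so treat it as a rough stand-in for `grep -R` rather than a replacement.

```python
import os


def find_references(root, needle):
    """Return (path, line number, line) for every line under root containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary or oddly encoded files from aborting the walk.
                with open(path, "r", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Unreadable files (permissions, dangling symlinks) are skipped.
                continue
    return hits
```

Calling `find_references("linux-2.6.38", "LOGO_LINUX")` corresponds to the first step in the list, and running it again on a single directory with `"cpu"` as the needle corresponds to the third.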

Next up in my source-diving adventures will be finding the code which controls what happens when the user presses control+alt+delete, in anticipation of sometime rewriting fb-hitler into a standalone kernel rather than a program running on top of Linux.

# Of Links and Kana

I sometimes use Links on various computers when I can’t be bothered to deal with a full graphical environment and just want to look something up. Given I also try to ensure that this site renders in an acceptable manner in text-only mode, Links is indispensable at times.

Now imagine my surprise when I discovered that Links will try to transliterate Japanese kana (a general term for the scripts in which characters correspond to syllables, rather than more abstract ideas such as in kanji) to some extent.

In that shot, Links has translated the kana in my page’s header to a reasonable romanization: the pronunciation of those characters would be Tari, as in the beginnings of ‘tan’ and ‘return’. I don’t know if that was a recent feature (I’m currently running Links 2.3pre1), but it was a pleasant surprise to see it romanizing kana.

# Obfuscation for Fun and Profit

One of the fun things to do with computer languages is abuse them. Confusing human readers of code can be pretty easy, but it takes a specially crafted program to be thoroughly incomprehensible to readers of the source code yet still be legal within the syntax of whatever language the program is written in.

Not dissimilar from building a well-obfuscated program is using esoteric languages and building quines. All of these things can be mind-bending but also provide excellent learning resources for some dark corners of language specification, as well as the occasional clever optimization.

## Obfuscation

It’s not uncommon for malware source code to be pretty heavily obfuscated, but that’s nothing compared to properly obfuscated code. What follows is some publicly released Linux exploit code.

    ver = wtfyourunhere_heee(krelease, kversion);
    if(ver < 0)
        __yyy_tegdtfsrer("!!!  Un4bl3 t0 g3t r3l3as3 wh4t th3 fuq!\n");
    __gggdfstsgdt_dddex("$K3rn3l r3l3as3: %s\n", krelease);
    if(argc != 1)
    {
        while( (ret = getopt(argc, argv, "siflc:k:o:")) > 0)
        {
            switch(ret)
            {
                case 'i':
                    flags |= KERN_DIS_GGDHHDYQEEWR4432PPOI_LSM|KERN_DIS_DGDGHHYTTFSR34353_FOPS;
                    useidt=1; // u have to use -i to force IDT Vector
                    break;
                case 'f':
                    flags |= KERN_DIS_GGDHHDYQEEWR4432PPOI_LSM|KERN_DIS_GGDYYTDFFACVFD_IDT;
                    break;

It reads like gibberish, but examination of the numerous #define statements at the beginning of that file and some find/replace action make quick work of deobfuscating the source. Beyond that, the sheer pointlessness of ‘1337 5p33k’ in status messages makes my respect for the author plummet, no matter how skilled they may be at creating exploits.

Let’s now consider an entry to the International Obfuscated C Code Contest (IOCCC) from 1986, submitted by Jim Hague:

    #define DIT (
    #define DAH )
    #define __DAH ++
    #define DITDAH *
    #define DAHDIT for
    #define DIT_DAH malloc
    #define DAH_DIT gets
    #define _DAHDIT char
    _DAHDIT _DAH_[]="ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:"
    ;main DIT DAH{_DAHDIT
    DITDAH _DIT,DITDAH DAH_,DITDAH DIT_,
    DITDAH _DIT_,DITDAH DIT_DAH DIT
    DAH,DITDAH DAH_DIT DIT DAH;DAHDIT
    DIT _DIT=DIT_DAH DIT 81 DAH,DIT_=_DIT
    __DAH;_DIT==DAH_DIT DIT _DIT DAH;__DIT
    DIT'\n'DAH DAH DAHDIT DIT DAH_=_DIT;DITDAH
    DAH_;__DIT DIT DITDAH
    _DIT_?_DAH DIT DITDAH DIT_ DAH:'?'DAH,__DIT
    DIT' 'DAH,DAH_ __DAH DAH DAHDIT DIT
    DITDAH DIT_=2,_DIT_=_DAH_; DITDAH _DIT_&&DIT
    DITDAH _DIT_!=DIT DITDAH DAH_>='a'? DITDAH
    DAH_&223:DITDAH DAH_ DAH DAH; DIT
    DITDAH DIT_ DAH __DAH,_DIT_ __DAH DAH
    DITDAH DIT_+= DIT DITDAH _DIT_>='a'? DITDAH _DIT_-'a':0
    DAH;}_DAH DIT DIT_ DAH{ __DIT DIT
    DIT_>3?_DAH DIT DIT_>>1 DAH:'\0'DAH;return
    DIT_&1?'-':'.';}__DIT DIT DIT_ DAH _DAHDIT
    DIT_;{DIT void DAH write DIT 1,&DIT_,1 DAH;}

What does it do? I couldn’t say without spending a while examining the code. Between clever abuse of the C preprocessor to redefine important language constructs and use of only a few language elements, it’s very difficult to decipher that program. According to the author’s comments, it seems to convert ASCII text on standard input to Morse code.

Aside from (ab)using the preprocessor extensively, IOCCC entries frequently use heavily optimized algorithms which do clever manipulation of data in only a few statements. For a good waste of time, I suggest browsing the list of IOCCC winners. At the least, C experts can work through some pretty good brain teasers, and C learners might pick up some interesting tricks or learn something new while puzzling through the code.

So what? Obfuscating code intentionally is fun and makes for an interesting exercise.

## Quines

Another interesting sort of program is a quine: a program that prints its own source code when run. Wikipedia has plenty of information on quines as well as a good breakdown on how to create one. My point in discussing quines, however, is simply to point out a fun abuse of the quine ‘rules’, as it were. Consider the following:

    #!/bin/cat

On a UNIX or UNIX-like system, that single line is a quine, because it’s abusing the shebang. The shebang (‘#!’), when used in a plain-text file, indicates to the kernel when loading a file with intent to run it that the file is not itself executable, but should be interpreted. The system then invokes the program given on the shebang line (in this case /bin/cat) and gives the name of the original file as an argument. Effectively, this makes the system do the following, assuming that line is in the file quine.sh:

    $ /bin/cat quine.sh

As most UNIX users will know, cat takes all inputs and writes them back to output, and is useful for combining multiple files (invocation like cat file1 file2 > both) or just viewing the contents of a file as plain text on the terminal. Final result: cat prints the contents of quine.sh.
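To check the claim rather than take my word for it, here is a small Python sketch that writes the one-line quine to a temporary file, marks it executable, runs it, and hands back both the source and the output for comparison. The helper name `run_shebang_quine` is made up for this post, and the sketch assumes a POSIX system with `cat` on the PATH and a temporary directory that permits executing files; it uses whatever path `cat` actually lives at instead of hard-coding /bin/cat.

```python
import os
import shutil
import stat
import subprocess
import tempfile


def run_shebang_quine():
    """Write a cat-shebang script, execute it, and return (source, output)."""
    cat_path = shutil.which("cat")
    if cat_path is None:
        raise RuntimeError("no cat executable found on PATH")
    source = f"#!{os.path.abspath(cat_path)}\n"
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "quine.sh")
        with open(script, "w") as handle:
            handle.write(source)
        # The kernel only honours the shebang if the file is executable.
        os.chmod(script, stat.S_IRWXU)
        result = subprocess.run([script], capture_output=True, text=True, check=True)
    return source, result.stdout
```

If the two returned strings are equal, the script printed exactly its own source, which is all a quine is required to do.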

Is that an abuse of the quine rules? Possibly. Good for learning more about system internals? Most definitely.

## Esoteric Languages

Finally in our consideration of mind-bending ways to (ab)use computer languages, we come to the general topic of esoteric languages. Put concisely, an esoteric language is one intended to be difficult to use or just be unusual in some way. Probably the most well-known one is brainfuck, which is.. aptly named, being Turing-complete but also nearly impossible to create anything useful with.
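To give a sense of just how little there is to brainfuck, here is a minimal interpreter sketch in Python. It covers the language’s eight commands on a 30,000-cell tape of wrapping bytes; the function name `run_brainfuck`, the wrap-around pointer, and reading a zero byte at end of input are my own choices, since implementations differ on those details.

```python
def run_brainfuck(program, stdin=""):
    """Interpret a brainfuck program and return everything it prints."""
    # Match up brackets ahead of time so loops can jump in one step.
    jumps = {}
    stack = []
    for pos, op in enumerate(program):
        if op == "[":
            stack.append(pos)
        elif op == "]":
            if not stack:
                raise ValueError("unmatched ]")
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    if stack:
        raise ValueError("unmatched [")

    tape = [0] * 30000
    ptr = 0
    pc = 0
    pending_input = iter(stdin)
    output = []
    while pc < len(program):
        op = program[pc]
        if op == ">":
            ptr = (ptr + 1) % len(tape)
        elif op == "<":
            ptr = (ptr - 1) % len(tape)
        elif op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ".":
            output.append(chr(tape[ptr]))
        elif op == ",":
            # End of input is treated as a zero byte (an assumption).
            tape[ptr] = ord(next(pending_input, "\0")) % 256
        elif op == "[" and tape[ptr] == 0:
            pc = jumps[pc]
        elif op == "]" and tape[ptr] != 0:
            pc = jumps[pc]
        # Any other character is a comment and is ignored.
        pc += 1
    return "".join(output)
```

As a quick check, `run_brainfuck("++++++++[>++++++++<-]>+.")` multiplies 8 by 8 in a loop, adds one, and prints the resulting byte 65, which is the letter A. Writing anything longer than that is where the language earns its name.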

The Esoteric language site has a variety of such languages listed, few of which are of much use. However, the mostly arbitrary limitations imposed on programmers in such languages can make for very good logic puzzles and often require use of rarely-seen tricks to get anything useful done.

One of my personal favorites is Petrovich. More of a command interpreter than programming language, Petrovich does whatever it wants and must be trained to do the desired operations.

# Raptor Speech

In a fit of boredom this evening, I tried to see what the speech recognition in Windows 7 would give back when I made raptor noises into it. The result.. speaks pretty well for itself:

F and has and has a Hack it has A hack who know Her house Just how hot enough And who know how It has had To add up data at data to go out and It’s all of all Go ahead goal happened: how has a Staff headed to a

And if his own booth for th FFI have had for the hand-held her and who often have no

# Btrfs

I recently converted the root filesystem on my netbook, a now rather old Acer Aspire One with an incredibly slow 1.8″ Flash SSD, from the ext3 I had been using for quite a while to the shiny new btrfs, which becomes more stable every time the Linux kernel gets updated. As I don’t keep any data of particular importance on there, I had no problem with running an experimental filesystem on it.

Not only was the conversion relatively painless, but the system now performs better than it ever did with ext3/4.

## Conversion

Btrfs supports a nearly painless conversion from ext2/3/4 due to its flexible design. Because btrfs has almost no fixed locations for metadata on the disc, it is actually possible to allocate btrfs metadata inside the free space in an ext filesystem. Given that, all that’s required to convert a filesystem is to run btrfs-convert on it; the only requirement is that the filesystem not be mounted.

As the test subject of this experiment was just my netbook, this was easy, since I keep a rather simple partition layout on that machine. In fact, before the conversion, I had a single 8GB ext4 partition on the system’s rather pathetic SSD, and that was the extent of available storage. After backing up the contents of my home directory to another machine, I proceeded to decimate the contents of my home directory and drop the amount of storage in-use from about 6GB to more like 3GB, a healthy gain.

### Linux kernel

To run a system on Btrfs, there must, of course, be support for it in the kernel. Because I customarily build my own kernels on my netbook, it was a simple matter of enabling Btrfs support and rebuilding my kernel image. Most distribution kernels probably won’t have such support enabled since the filesystem is still under rather heavy development, so it was fortunate that my setup made it so easy.

### GRUB

The system under consideration runs GRUB 2, currently version 1.97, which has no native btrfs support. That’s a problem, as I was hoping to only have a single partition. With a little research, it was easy to find that no version of GRUB currently supports booting from btrfs, although there is an experimental patchset which provides basic btrfs support in a module. Unfortunately, to load a module, GRUB needs to be able to read the partition in which the module resides. If my /boot is on btrfs, that’s a bit troublesome. Thus, the only option is for me to create a separate partition for /boot, containing GRUB’s files and my Linux kernel image to boot, formatted with some other file system. The obvious choice was the tried-and-true ext3.

This presents a small problem, in that I need to resize my existing root partition to make room on the disc for a small /boot partition. Easily remedied, however, with application of the Ultimate Boot CD, which includes the wonderful Parted Magic. GParted, included in Parted Magic, made short work of resizing the existing partition and its filesystem, as well as moving that partition to the end of the disc, which eventually left me with a shiny new ext3 partition filling the first 64MB of the disc.

## Repartitioning

After creating my new /boot partition, it was a simple matter of copying the contents of /boot on the old partition to the new one, adjusting the fstab, and changing my kernel command line in the GRUB config file to mount /dev/sda2 as root rather than sda1.

Move the contents of /boot:

    $ mount /dev/sda1 /mnt/boot
    $ cp -a /boot /mnt/boot

## The script

Having implemented word generation in the module, it was reasonably short work to wrap the whole thing in a script so it could be invoked from the command line for great lulz.  Something like the following does a decent job of providing amusement by generating a word every 15 seconds.  For more fun, pipe the output into a speech synthesizer.

    Tari@Kerwin ~ $ while markov.py; do sleep 15; done

Of course, before anything can be generated, a graph must be generated, which can be done via the -s option on the script or by invoking the addString method of MarkovMap. Quick example:

    Tari@Kerwin ~ $ # Add the given string to the current graph, or to a new one.
    Tari@Kerwin ~ $ markov.py -s"String to seed with" -ffoo.pkl
    IO error on foo.pkl, creating new map
    seeeeed
    Tari@Kerwin ~ $ # Add some Delmore Schwartz to the map via stdin
    Tari@Kerwin ~ $ markov.py -ffoo.pkl -s- << EOF
    > (This is the school in which we learn...)
    >What is the self amid this blaze?
    >What am I now that I was then
    >Which I shall suffer and act again,
    >The theodicy I wrote in my high school days
    >Restored all life from infancy,
    >The children shouting are bright as they run
    >(This is the school in which they learn...)
    >Ravished entirely in their passing play!
    >(...that time is the fire in which they burn.)
    >EOF
    idagheam
    Tari@Kerwin ~ $ # Generate a word from the default graph in file markov.pkl
    Tari@Kerwin ~ $ markov.py
    awaike
    Tari@Kerwin ~ $

Easy enough. I’ve found that a Maori seed (via Project Gutenberg) makes for some of the more easily pronounced words, but any language will (mostly) generate words that are pronounceable via that language’s pronunciation rules.

For seeding with non-Latin character sets, the script can take the -l or --lax option (‘strict’ keyword parameter to MarkovMap.addString()), which removes the restriction keeping graphed characters as only alphabetic. The downside, then, is that everything in the input is mapped out, so you’re much more likely to get garbage out unless the input is carefully sanitized of punctuation and such (GIGO, after all).
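For readers who want the gist without opening the download, the following is a minimal sketch of letter-level Markov word generation with a strict/lax switch. It is not the code from markov.py: the names `new_graph`, `add_string` and `generate_word` are hypothetical, and I am assuming that "alphabetic" means ASCII letters, which would explain why non-Latin input needs the lax mode.

```python
import random
import string
from collections import Counter, defaultdict

# Sentinel states; real characters are single-character strings, so these cannot collide.
START, END = "<start>", "<end>"


def new_graph():
    """Map each state to a Counter of the states seen following it."""
    return defaultdict(Counter)


def add_string(graph, text, strict=True):
    """Record letter-to-letter transitions for every word in text."""
    for word in text.lower().split():
        if strict:
            # Strict mode keeps ASCII letters only (an assumption about the original).
            word = "".join(ch for ch in word if ch in string.ascii_lowercase)
        if not word:
            continue
        previous = START
        for ch in word:
            graph[previous][ch] += 1
            previous = ch
        graph[previous][END] += 1


def generate_word(graph, rng=random, max_len=20):
    """Walk the graph from START, picking each next letter by observed frequency."""
    letters = []
    state = START
    while len(letters) < max_len:
        choices = graph.get(state)
        if not choices:
            break
        following = rng.choices(list(choices), weights=list(choices.values()))[0]
        if following == END:
            break
        letters.append(following)
        state = following
    return "".join(letters)
```

Seeding with `add_string(graph, "banana bandana")` and calling `generate_word(graph)` a few times yields words that always start with b and only ever use letter pairs present in the seed. Passing `strict=False` graphs punctuation and digits as well, which is exactly the garbage-in, garbage-out trade-off described above.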

## Code

Enough talk, I’m sure you just want to pick apart my code and play with nonsense words at this point.  Download link is below.  I’m providing the code under the Simplified BSD License so you’re allowed to do nearly anything with it, I just ask that you credit me for it in some way if you reuse or redistribute it.

Download markov.py

# Wednesday link dump

Because I have nothing better to do right now, it’s a good time to dump the interesting links that I’ve been accumulating.

• While radioactive hunks of matter are often portrayed as glowing with a green tinge, we all know that’s not actually true.. unless there’s Cherenkov Radiation involved, as in many nuclear reactors- that’s not green, though.
• Google have (for now) won the suit against them by Viacom regarding copyrighted content being uploaded to YouTube, which is good news for everyone except maybe Viacom.  It’s still fun to read choice excerpts of correspondence involving all sorts of mudslinging in the case (warning: lots of curses).
• OpenStreetMap is a neat project to create free maps, similar to Google Maps, Bing Maps, etc.  Cool stuff, and all the map data is Creative Commons, meaning it could be used for any number of shiny projects.
• There might be life on Saturn’s moon, Titan, observations courtesy of the NASA/ESA/ASI Cassini mission, which has been bouncing around the Saturnian system since mid-2004 after launch way back in 1997.  It’s far from a sure thing, but it’s really exciting that predictions of how life might work on Titan have been supported by observation.
• This study (PDF) of internet routing to previously unused blocks is quite interesting, especially the numerous SIP streams pointed at 1.1.1.1 (section 5.1).
• The EFF (kind of like the ACLU of the internet, if you’re not familiar with them) recently put out the HTTPS Everywhere extension for Firefox.  When it’s this easy to lock down your web traffic, there’s no reason not to.  What’s your excuse?
• Huge things are cool.  Want to feel tiny?  Go ask Wikipedia about the local supercluster, then consider how tiny everything humanity knows is, relative to that.  When you’re done scrabbling about in your own Total Perspective Vortex, consider epic timescales for extra kicks.  Yeah.. cosmology is awesome.

..and that’s several weeks of accumulated cool-things.  Enjoy.