Web history archival and WARC management

I’ve been a sort of ‘rogue archivist’ along the lines of the Archive Team for some time, but generally lack the combination of motivation and free time to directly take part in their activities. That said, I do sometimes go on bursts of archival since these things do concern me; it’s just a question of when I’ll get manic enough to be useful and latch onto an archival task as the one to do. An earlier public example is when I mirrored ticalc.org. The historical record contains plenty of instances where people maintained copies of their communications or other documentation which has proven useful to study, and in the digital world the same is likely to be true.

Read More…

HodorCSE

Localization of software, while not trivial, is not a particularly novel problem. Where it gets more interesting is in resource-constrained systems, where your ability to display strings is limited by display resolution and memory limitations may make it difficult to include multiple localized copies of any given string in a single binary. All of this is then on top of the usual (admittedly slight in well-designed systems) difficulty in selecting a language at runtime and maintaining reasonably readable code. This all comes to mind following discussion of providing translations of Doors CSE, a piece of software for the TI-84+ Color Silver Edition1 that falls squarely into the “embedded software” category.

Read More…

"A Sufficiently Smart Compiler"

On a bit of a lark today, I decided to see if I could get Spasm running in a web browser via Emscripten. I was successful, but found that something seemed to be optimizing out most of main() such that I had to hack in my own main function that performed the same critical functions and (for the sake of simplicity) hard-coded the relevant command-line options. Looking into the problem a bit further, I observed that not all of main() was being removed; there was one critical line left in. The beginning of the function in source and the generated code were as follows.

Read More…

Reverse-engineering Ren'py packages

Some time ago (September 3, 2013, apparently), I had just finished reading Analogue: A Hate Story (which I highly recommend, by the way) and was particularly taken with the art. At that point it seems my engineer’s instincts kicked in and it seemed reasonable to reverse-engineer the resource archives to extract the art for my own nefarious purposes. Yeah, I really got into Analogue. That's all of the achievements. A little examination of the game files revealed a convenient truth: it was built with Ren’Py, a (open-source) visual novel engine written in Python. Python is a language I’m quite familiar with, so the actual task promised to be well within my expertise.

Read More…

GStreamer's playbin, threads and queueing

I’ve been working on a project that uses GStreamer to play back audio files in an automatically-determined order. My implementation uses a playbin, which is nice and easy to use. I had some issues getting it to continue playback on reaching the end of a file, though. According to the documentation for the about-to-finish signal, This signal is emitted when the current uri is about to finish. You can set the uri and suburi to make sure that playback continues. This signal is emitted from the context of a GStreamer streaming thread. Because I wanted to avoid blocking a streaming thread under the theory that doing so might interrupt playback (the logic in determining what to play next hits external resources so may take some time), my program simply forwarded that message out to be handled in the application’s main thread by posting a message to the pipeline’s bus.

Read More…

Newlib's git repository

Because I had quite the time finding it when I wanted to submit a patch to newlib, there’s a git mirror of the canonical CVS repository for newlib, which all the patches I saw on the mailing list were based off of. Maybe somebody else looking for it will find this note useful: git clone git://sourceware.org/git/newlib.git See also: the original mailing list announcement of the mirror’s availability.

Matrioshka brains and IPv6: a thought experiment

Nich (one of my roommates) mentioned recently that discussion in his computer networking course this semester turned to IPv6 in a recent session, and we spent a short while coming up with interesting ways to consider the size of the IPv6 address pool. Assuming 2^128 available addresses (an overestimate since some number of them are reserved for certain uses and are not publicly routable), for example, there are more IPv6 addresses than there are (estimated) grains of sand on Earth by a factor of approximately 3 × 10^14 (Wolfram|Alpha says there are between 10^20 and 10^24 grains of sand on Earth).

Read More…

"Four"ier transform

Today’s Saturday Morning Breakfast Cereal: I liked the joke and am familiar enough with the math of working in unusual bases that I felt a need to implement a quick version of this in Python. Code follows. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 #!/usr/bin/env python def fourier(x, b): """Attempts to find a fourier version of x, working down from base b.

Read More…

Of Cable Modems and the Dumb Solution

I was studying in Japan last semester (which explains somewhat why I haven’t posted anything interesting here in a while). That’s a whole different set of things to blog about, which I’ll get to at some point with any luck (maybe I’ll just force myself to write one post per day for a bit, even though these things tend to take at least a few hours to write..). Background At any rate, back in Houghton I live with a few roommates in an apartment served by Charter internet service (which I believe is currently DOCSIS2). The performance tends to be quite good (it seems that the numbers that they quote for service speeds are guaranteed minimums, unlike most other ISPs), but I like to have complete control over my firewall and routing.

Read More…

Better SSL

I updated the site’s SSL certificate to no longer be self-signed. This means that if you use the site over HTTPS, you won’t need to manually accept the certificate, since it is now signed by StartSSL. If you’re interested in doing similar, Ars Technica have a decent walk through the process (though they target nginx for configuration, which may not be useful to those running other web servers).