
another system definition facility

revisiting language decisions.

dream stack:

* master – oz
* sim – k
* mdl – qi

lols. all-star. realistic:

* master – just writing it in erlang instead of poorly and distractingly reimplementing it piecemeal in scala (akka) / python (???). it’s shaping up to be spinoffable into an independent project.
* sim – obviously staying c++ / asm / cuda
* ports – python (~everything~ speaks python, yay pip)
* stats – probably python (numpy / scipy / pandas) – only other competitor is J
* mdl – current question. contenders: python, haskell, ocaml, scala. requirements: parallel exec, DS-mini-L handling, 3d cellular automata execution and concise serialization. needs to talk to kyotocabinet, needs a 3d math lib, ideally a json-rpc lib, maybe a parsecy lib.

python’s almost def out for mdl sadly, as loath as I am to depend on yet another language. it’s bad at packing tiny data, bad at dense 3d vector math, and terrible at parallel exec – mdl entities just got heavyweight, they have a local cache of spatial info per host, so in python you’d need 1 + #cores processes per box all jabbering over pipes, gotta worry about one going down and not the others, etc etc. ick.

it looks like a great fit for scalakka, but I’d really like to avoid having a jvm dep. or a scala dep. sbt, maven, odersky, oh my. don’t get me wrong, I love the language, but it’s still growing, people are still figuring out how to do things, and the jvm’s a huge bitch. a glorious jit, but so, so much ceremony I don’t want. so far I’m at a relative minimum of cruft, and I’d like to keep it that way. the jvm seems to like to own things. plus, where the package manager at?

so ocaml vs haskell. as far as language maturity goes ml’s got that down, haskell not so much but seems pretty good about not introducing breaking changes. I’ve never done anything in ocaml, I have no idea what it has in terms of packages and am almost sure there’s no kyoto binding for it, but I’ve seen what it can do with vector numeric calcs, and am really intrigued by its ability to modify its own grammar. the idea of updating blocks of 4k 3d vecs at a time, in place, many many many times, just seems anathema to haskell. yeah it can do it, but you’d be fighting the language the whole way.

I dunno. I’d prefer haskell, but ocaml seems like a better fit from this distance, and would be fun. I guess I’ll compare raytracer implementations, pick the prettiest one I can find. exciting.

-

definitely getting a van. I realize how much I miss loud, loud, high quality, loud, high quality music at my disposal. city life, thin apartment walls don’t allow that. I just want a mobile room I can drive down to the beach and blast and code in, take to the woods for the weekends, take to festivals, road trip in. it doesn’t need to be some full on decked out camper van thing with running water and a stove and crap, shit I barely use my stove at home and I could pretty much live with just a jug of water instead of a sink. still gotta be a 4×4 though, been to one too many rainy, muddy festies.

-

oh yeah, a data structure I might wind up having to write: distributed, transactional, lazily cached octree. you could do some weird shit with that. I’ve also stumbled upon sly3, a smalltalk impl that claims to be not just lockless, but “anti-lock”. take that amdahl. I’ve always known my sim is gonna be this kind of async (sorta), but I might even explore it for the mdl now that I’ve seen some cold hard research on it. pinteresting.
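for kicks, the boring core of one in python – a plain point octree, nothing distributed, transactional, or cached about it, every name made up:

class Octree:
    # leaf buckets of points that split into eight children past a capacity
    def __init__(self, center, half, cap=8):
        self.center, self.half, self.cap = center, half, cap
        self.points, self.kids = [], None

    def _octant(self, p):
        # bits 0/1/2 = which side of center on x/y/z
        cx, cy, cz = self.center
        return (p[0] > cx) | (p[1] > cy) << 1 | (p[2] > cz) << 2

    def insert(self, p):
        if self.kids is None:
            self.points.append(p)
            if len(self.points) > self.cap:
                self._split()
        else:
            self.kids[self._octant(p)].insert(p)

    def _split(self):
        h = self.half / 2
        cx, cy, cz = self.center
        self.kids = [Octree((cx + (h if i & 1 else -h),
                             cy + (h if i & 2 else -h),
                             cz + (h if i & 4 else -h)), h, self.cap)
                     for i in range(8)]
        pts, self.points = self.points, []
        for q in pts:
            self.kids[self._octant(q)].insert(q)

the distributed / transactional / lazily cached parts are the actual hard bit, of course.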

evodevo

hopefully in 5 years we’ll all be speaking ithkuil and writing an apl descendant on massively parallel vliw architectures. with ai and synthetic life and safe nuclear and all that, too.

-

master node is evolving rapidly, aided heavily by an impromptu insomnia day yesterday. I find that shift in state of mind toward delirium can sometimes be conducive to my technical creativity, planned or not.

so I was gonna manually handle everything: the different entities would have to connect back to the master to form a control channel. yet entities come and go (model processors for example), and the master needs to spawn and maybe kill them, so it’d need to be able to launch stuff. ok, so I guess you need a controller entity, now that needs to be watched, yadda yadda.

long story short, now I’m thinking entities are just unix processes, the master literally ssh’s or rsh’s to hosts and controls them directly. json speaking entities will now literally just gets() and puts() lines of json. even better, now not all entities need to speak json – for example converting cells from the mdl’s output text format to the sim’s internal binary format can be done stdin (text) -> stdout (binary). if something takes too long the master can just kill the process, or just kill the conn and let it hup. monitoring host loads is now also solved, just built in unix commands (or a little py lib, it’s all just stdin/stdout now).
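a sketch of what that looks like from the master’s side – python flavored, with the host name and bin path invented:

import json, subprocess

# an entity is just a unix process on a remote host speaking
# newline-delimited json on stdin/stdout
proc = subprocess.Popen(["ssh", "worker1", "minerva3/bin/mdl-entity"],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def tell(cmd, **args):
    proc.stdin.write(json.dumps(dict(cmd=cmd, **args)) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())   # entity gets()/puts() in kind

print(tell("load_cell", cell="c42"))            # hypothetical command
proc.stdin.close()                              # or kill the conn and let it hup
proc.wait()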

I want the master to control resources entirely, to the point that I don’t want entities to know about them AT ALL, to be literally unable to concat together a file path for a resource. they obviously need to get paths, sim needs to mmap every binary converted cell file, but they’ll come from the master. I may make an exception for another librarian kind of entity that makes sure all the important resources are kept safe and not just living un-duplicated on hosts which may burst into flames at any moment, but that’d be the exception. in general master will do that.

so now another question I didn’t even realize I had has been answered, I have a slave host fs hierarchy developing:

minerva3
--bin
--logs
--resources
----cells
------text
------binary (actually-mmapped files)
----model
------markup
------fragments

cooler still I can (and do) abstract the notion of Host away into a very few primitives (entity mgmt, resource mgmt, load / capacity info) to support weird shit down the line. on rsh v ssh I’d at first prefer the simplicity of rsh to ssh – I really, really don’t want it to be encrypted. not only for overhead but for debuggability and simplicity. I’m targeting ec2 really and all intranet traffic is encrypted between your nodes anyway, and I’m sending tons of unencrypted data around. buuut ssh can do cool shit like built in file xfer which may be handy (or may not, resource xfer is host->host, master shouldn’t ever act as a content proxy), it might be able to do cool tricks with file handles allowing for just one ssh conn per host instead of one per entity, it’s gotten credentials all thought out, and it would let me do this over untrusted networks (non ec2). meh, it’s probably the right choice largely for that last reason, the rest are speculative. but abstract host implementations let me change shit like this and even let them coexist.
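roughly, the host abstraction as a python sketch – the method names are my guesses at those primitives, with both transports filling in the same interface so they can coexist:

from abc import ABC, abstractmethod

class Host(ABC):
    # entity mgmt
    @abstractmethod
    def spawn(self, entity_type, argv): ...
    @abstractmethod
    def kill(self, entity_id): ...
    # resource mgmt (xfer is host->host, master never proxies content)
    @abstractmethod
    def fetch_resource(self, path, from_host): ...
    # load / capacity info
    @abstractmethod
    def load(self): ...

class SshHost(Host): ...   # creds sorted, works over untrusted networks
class RshHost(Host): ...   # dumb and unencrypted for trusted intranets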

even coooooler is that this supports elasticity – the master’s host driver can acquire a new host, send it bins, set up the base fs, and add it to the pool for the master to spawn shit on at will. similarly it can kill them all and release the host.

ice cold is that the master really will be an erlang style giant web of tiny actors and machines, ultimately sending dead simple commands to dumb entities on remote hosts. I really could write it in erlang or akka now if I wanted to, because of how decoupled it is from whatever the entities do, and it, the entire master system, would scale extremely well – the master could itself live distributed on multiple servers with the actor model, so for 5000 nodes one poor python instance won’t have to drive 5000 ssh connections. this is starting to own. it still does need to make decisions though (load balancing, efficient entity positioning), but I might delegate those to the very vaguely defined stat entities now (since they look a lot like the algs used for chopping cells and lod and shit).

** the term ‘node’ in the context of ‘sim node, mdl node, port node, stat node, …’ has been banned for confusion with the model’s definition language’s IR – model nodes are things like ‘Times’, ‘Choose’, ‘Rotate’, ‘RandPoint’, ‘Cortex’, etc. entity is better, hosts host entities, running instances of entity types.

I’m really overdue for such an epiphany about the mdl dsl. god damn.

these developments brought to you by frantically scribbling in a notebook in dolores park.

-

evolution is quite a word. to some it conjures up thoughts of the origin of species, of the vortex of biology and chemistry and physics and math that resulted in us, perhaps catapulting them into a full on daydream. to others it’s an incendiary topic they see as an affront to things they believe more deeply than anything else and can result in hostility. more generally it can mean the incremental development of something over time in response to its environment, perhaps making people think of great works or ideas they’ve studied, followed, or even contributed to in their lives. personally though I’m forever doomed to think first of richard dawkins, and then immediately of emma watson. god damn it, internet.

pseudo-minimalism / minerva stuff

Materialist ramble, and not the philosophical kind.

I like being minimalist. I’m not one of those count-everything-you-own types, inventorying every fork and spoon and plate I own, but by and large the only things of note that I [want to] own which I don’t carry around in my backpack all day are my bed and a beanbag. I really need to sell the four computers I haven’t turned on in months. My old desktop has been eclipsed in everything but gpu by my laptop – I don’t really game anymore, and amazon can give me cuda when I need it. And my old 6 drive zfs raidz2 array’s contents now fit on a single external hooked up to my diehard netbook server, with the important stuff backed up elsewhere. I suppose I’ll always have a desk with a keyboard, mouse, giant monitor, and godly headphone rig, but it’s now just external accessories for my laptop. I’ve said it before, but being able to sit down practically anywhere – in a bar, on a beach, in the park – with my full workstation and broadband is incredibly awesome. My dad said it, but he was into it before laptops were powerful enough for me. My brother said it, but that was in Jacksonville where pretty much the only interesting place to hang out is exactly one hookah lounge. Here and now however I’ve finally seen that light, and it’s bright.

A backpack with the most badass laptop available, a trusty android tablet, a rockbox ipod 4g, the most badass earbuds and usb dac/amp available, the most badass android phone available, and, like, a mouse and stuff. Really all I want. I know many people this age for countless generations have gone through this ‘phase’, but times really are different. With the laptop alone I have every book, movie, song, and line of code ever or yet written. The collective knowledge of mankind and all that. I have unfettered access to as much computing resources as I need via the proverbial cloud. I have games, should I want them. This has never, in the history of the known universe, happened before. Used to be that to get information you had to ask people in person, or write them letters, or consult books of your own or of a library’s, and to do research you had to be at a university or lab and have all kinds of crazy, heavy, expensive equipment at your fingertips. I don’t expect people used to the way things have always been – or as my inner singularist would phrase it, people used to linear growth – to understand this, but physical objects and even physical location are now just conveniences, not requirements. What an age.

As much as I’ve loved my 240sx I really can’t wait to sell it. It’s been a brilliant machine and has served me unspeakably well, but I think that phase of my life has ended. The only [US] cities I can see myself living in now are SF and NYC which have real public transpo, and Boulder, and it wouldn’t fare well at all in the snow.

I suppose I have exactly three tiers of material want:

1) My attainable, realistic backpack.
2) A goddam decked out camper van. I’ve always dreamt of taking a year and hitting up every good festival in the country, and camping out in the middle of nowhere with my whole house, and [secondarily] seeing the many interesting cities on the continent. I’m almost embarrassed to admit how much I’ve thought this out. This is mid-term attainable, but I’m not sure I’ll come into that kind of cash before I’m too old to enjoy it. Meh, it’s a distant blip on the radar.
3) A goddam airship. Hey, everyone needs goals.

Seeing only those, the theme seems to be nomadism more than minimalism, but the last two are largely pipe dreams, rewards for something I’ve yet to accomplish. By far most importantly I’m a being of information, and a friggin laptop and broadband is all I need for complete self-realization. Granted my projects folder is a graveyard of hundreds of cloned git repos and pdfs and stuff, but I’m speaking purely physically.

Getting a job in the city. Not just ‘some job’, like I didn’t just move to ‘some city’. A perfect fit, all boxes checked. I’m still in that phase of initial disbelief, really – I’ll believe it when I start there. Literally dream come true. But this means I don’t need to drive, and I can move to the mecca of mission, a hazy stagger from ritual, the park, noisebridge, and sweet, sweet music venues. Block parties and community, not a sterile commercial barrens (ALLIES IN XR). A cheaper, smaller, simpler apartment away from the highway and even more overpriced boutique soma. I don’t need or want all this space – all I need is a room. To fit my bed and beanbag. I see people constructing, lusting for, and living in such giant homes elsewhere and I honestly wonder what they need all that stuff and space for. I can see the appeal of having land of your own, trees and lakes and wildlife, a homestead, but it’s not me. And, of course, I’m just some dude, not a family. Different strokes I guess, and this is mine. Giggity.

It’s somewhat chicken and egg: do I love deletion and simplification because I am at my core a coder, a constructor, a purity seeker, an unapologetic but not soulless reductionist, or vice versa? A ponderance, but by no means an evaluatory crisis. Almost like asking which side of an arch was built first.

page break

Man, actors and state machines. I had written off the ‘master’ part of my project as small earlier, but now it needs to exist, and it’s become less and less small. I guess that’s part of the game – when you try to keep each component simple the shifted duties have to wind up somewhere. Mister master has wound up getting stuck with all the managerial arbitration I don’t want to pollute the machine gun sim nodes or fanciful mdl nodes with. It’s obviously for the best – now I have no ugly distributed spider web of machinery living on different machines, it’s all under one monolithic, GIL’d, garbage collected roof, all queryable and trackable without worrying about the word network – but damn sure it’s still no hello world.

I’ve been trying to do it erlang style, and I think I’ve settled on at least a vague silhouette of what I want it to be. The best actor offerings I know of are erlang and akka, and potentially mozart/oz, but I’m going to be keeping it all python and si-pi-pi. Python’s actor libs seem to be relatively lacking, all going off in poorly fitting directions or having not been updated since 2004 (candygram). Thankfully I’ve found pykka – gevent driven. I’m pretty set on using twisted to handle the ugly socket bits on the py side, and delightfully it can be run with a gevent reactor – hooray harmony. So I’m currently mulling over the interplay of actors, machines, and twisted handlers. Sim and mdl nodes are to remain as stupid as possible – sync and async command handlers with as little state as can be. I don’t want them being machines. They handle commands which can be issued, queried, and cancelled. Concessions must be made, for example the sim nodes must loop until they’re told to stop to avoid comically inefficient master round trips and mass joins, but for the most part they wear helmets. So as for the master, where everything lives: are all actors machines? Are all connections actors? Are all commands messages? Are all messages transitions? Are all simple, linear sequences of transitions their own machines? And vice versa for each of these. What’s in my head at the moment is an all powerful sim machine created on connection which creates and kills child machines as necessary, where each machine is a pykka actor. On the sim side commands need to have IDs to be queryable and cancellable, but I don’t think all commands require an actor or a machine. I don’t have an example of this at the moment, however. I thought I did. It really looks heavyweight, but none of this happens per frame – master control commands are all rare, catastrophic events like loading cells and spinning up nodes and whatnot. However it may turn out, it’s an interesting thought experiment. I’m consciously trying to force myself to be somewhat liberal about this side, to contrast my obsession with bits on the sim. Don’t be afraid of allocation, let’s see what happens.
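To make that slightly concrete, a toy of the shape I’m circling – every machine a pykka actor, every command a message, a parent machine spawning children – with ThreadingActor standing in (the gevent flavor would slot in the same way) and all names invented:

import pykka

class CommandMachine(pykka.ThreadingActor):
    # one machine per issued command, queryable and cancellable by id
    def __init__(self, cmd_id):
        super().__init__()
        self.cmd_id, self.state = cmd_id, "running"

    def on_receive(self, msg):
        if msg["op"] == "query":
            return self.state
        if msg["op"] == "cancel":
            self.state = "cancelled"
            self.stop()

class SimMachine(pykka.ThreadingActor):
    # the all powerful machine created on connection
    def __init__(self):
        super().__init__()
        self.children = {}

    def on_receive(self, msg):
        if msg["op"] == "issue":
            self.children[msg["id"]] = CommandMachine.start(msg["id"])
        elif msg["op"] == "query":
            return self.children[msg["id"]].ask({"op": "query"})

sim = SimMachine.start()
sim.tell({"op": "issue", "id": 1})
print(sim.ask({"op": "query", "id": 1}))   # -> "running"
pykka.ActorRegistry.stop_all()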

On the bits front though I’ve been trying to implement an absolutely minimal lockless stack for the sim page allocator (free list). I think I’m going to be using TBB again – the atomics are (ALMOST) bulletproof, I’d need to write a threadpool / workqueue thing to handle async commands anyway, and wow is its TLS and thread local mem allocator gorgeous. As for the ALMOST: hey intel how about lock cmpxchg16b? TBB simply can’t currently emit that. Apple’s assembler certainly can’t (it won’t even do avx, cause not all macs have avx and mother apple won’t let you compile things that might not run on everymac). I tried asm(“.byte 0xFE 0xAC”) or whatever it is but it didn’t work and I gave up. I don’t need it, I was just curious really.

Oh yeah, lockless stack. concurrent_queue (this is just a freelist so it doesn’t matter if it’s a queue or stack, and there’s no tbb::conc_stack) is hilariously heavyweight for something as trivial but inner-loopy as page allocation here. It yields. Even tbb::spin_mutex yields. You can probably configure it to not, but it’s still too much. Furthermore, to use it (conc_queue) for a page freelist it’d necessarily heap alloc nodes, because TBB doesn’t support intrusive data structures. Weeeeak. So I’ve banned TBB, with the exception of atomics, from the holy inner loop, and thus need to write the six lines of code needed for a lockless intrusive stack. Using a ‘hands off’ sentinel of -1, pop works out to this (noinlined):


00000001000016e0 <__Z9test6_popR11AtomicList6>:
1000016e0: push rbp
1000016e1: mov rbp,rsp
1000016e4: nop WORD PTR [rax+rax*1+0x0]    # alignment padding
1000016ea: nop WORD PTR [rax+rax*1+0x0]
1000016f0: mov rax,0xffffffffffffffff      # the -1 ‘hands off’ sentinel
1000016f7: lock xchg QWORD PTR [rdi],rax   # atomically swap it with head
1000016fb: cmp rax,0xffffffffffffffff      # got the sentinel back?
1000016ff: je 1000016f0 <__Z9test6_popR11AtomicList6+0x10>  # someone else has the list; spin
100001701: mov rcx,QWORD PTR [rax]         # rcx = popped node's next
100001704: mov QWORD PTR [rdi],rcx         # publish new head, releasing the list
100001707: pop rbp
100001708: ret

(stealth edit: pasted wrong thing)

As stores don’t get reordered with respect to other stores, a fence instruction isn’t necessary (but TBB can do that too – __TBB_full_memory_fence(); – I think it’s undocumented, and I can’t figure out how to get lfence or sfence and not just mfence, but hey it’s something). I am a little bummed about the mov, xchg, cmp, je loopage – with cmpxchg you can just do cmpxchg and jnz back directly to it, but wow does it not really matter. And for completeness:


00000001000016b0 <__Z10test6_pushR11AtomicList6PNS_4NodeE>:
1000016b0: push rbp
1000016b1: mov rbp,rsp
1000016b4: nop WORD PTR [rax+rax*1+0x0]
1000016ba: nop WORD PTR [rax+rax*1+0x0]
1000016c0: mov rax,0xffffffffffffffff      # same sentinel dance as pop
1000016c7: lock xchg QWORD PTR [rdi],rax
1000016cb: cmp rax,0xffffffffffffffff
1000016cf: je 1000016c0 <__Z10test6_pushR11AtomicList6PNS_4NodeE+0x10>
1000016d1: mov QWORD PTR [rsi],rax         # node->next = old head
1000016d4: mov QWORD PTR [rdi],rsi         # head = node, releasing the list
1000016d7: pop rbp
1000016d8: ret

you gotta love ‘nop’ taking arguments, lols. I know it’s more efficient, it’s just funny. And there will be a much fatter sad path in here when head == NULL in which case it needs to allocate a new slab and thread the next pointers and shit, but that’s all init time really, once enough pages come to life it won’t happen. Overall I’m okay with this.

So now I’m also wondering how to do the machine guns – the num-of-cores threads that furiously run through every cell running a sim stage till there are none – most direct approach is queue n affinity-locked tasks with a completion task that depends on them every stage every frame. It’s allocy, but what like 5 stages * 4 cores allocs per frame? Wow I shouldn’t care about that. At this point it feels like I’m building a life size eiffel tower out of toothpicks. Hey everyone needs a hobby.
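That shape in python dress rather than TBB, purely illustrative:

from concurrent.futures import ThreadPoolExecutor

CORES, STAGES = 4, 5

def run_stage(stage, worker, cells):
    # each worker chews its stripe of cells for this stage
    for cell in cells[worker::CORES]:
        pass   # ... step `cell` through `stage` ...

with ThreadPoolExecutor(CORES) as pool:
    cells = list(range(1000))   # stand-ins
    for frame in range(3):
        for stage in range(STAGES):
            futs = [pool.submit(run_stage, stage, w, cells) for w in range(CORES)]
            for f in futs:
                f.result()      # the completion task: join before the next stage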

page break

And the sweet sweet mdl. So with the current draft the structure of the net before neurons must fit snugly in everyone’s mem without a peep – that means every mdl node instance, of which there are num-of-cores per mdl box, knows where every little neocortical column is, and has a duplicate copy of that jazz in its mem (barring fork COW magic). I think this is doable. I’m seeing:

1) Gen till neurons
2) Map neuron generation to workers, either spatially or just buck wild randomly, dumping them all to flat files
3) Reduce all them flat files into an octree of neuron positions
4) Map synapse generation to workers, as above, dumping them all to flat files
5) Reduce all that garbage into a humungous octree of synapse positions

Basically. There’s still a ton of question marks:

* Chopping them up into cells?
* Neurons and synapses have STATE, they grow and die and branch and shit, a simple octree alone isn’t sufficient
* Sharded octrees / state db’s? How you gonna handle edges?
* There’s also neurites, neural projections that spawn synapses but aren’t neurons themselves – kinda need them
* Will all syns really be ignorant of each other? All neurites? All neurons? Or will it be stepped or something, jittering and tweaking the phases until some condition is met?
* Oh yeah, the whole DSL is still in bits and pieces – I haven’t even decided the best way to generate radially distributed points yet (deceptively difficult to do cleanly, especially when I want density to vary arbitrarily by distance from the center).

But progress. Definitely heavily inspired by my data creator, in fact started as a straight python port of it. Brevity brought to you by __subclasses__().
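A rough skeleton of that map/reduce shape – every name hypothetical, a process pool standing in for the workers, and plain lists standing in for the flat files and octrees:

from multiprocessing import Pool

def gen_neurons(column):
    # 2) map: grow neurons for one column, dump flat output
    cid, n = column
    return [(cid, i) for i in range(n)]

def gen_synapses(neuron):
    # 4) map: synapse generation per neuron, again to flat output
    return [(neuron, k) for k in range(3)]

if __name__ == "__main__":
    columns = [(f"c{i}", 100) for i in range(8)]   # 1) gen till neurons
    with Pool() as p:
        # 3) and 5) are the reduces, which for real would build the octrees
        neurons = [n for flat in p.map(gen_neurons, columns) for n in flat]
        synapses = [s for flat in p.map(gen_synapses, neurons) for s in flat]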

page break

Oh, and finally on the spikerouter protocol. I have decreed that sims shall have three network.. things..: a udp spike listener, a tcp spike router update listener, and a tcp master command connection. udp spike sink will be dead simple, just taking (for now) bigass chunks of fired neuronid’s (really exportidxs, long story) of stuff that fired (for now, really a bit more complex than that but that’s the gist). Master conn is obvious conn. The spike router update listener is the recently materialized one – it’s sim <-> sim (well sim entity <-> sim entity), and connections are one time use. A sim connects to another when it needs to tell it that it wants to change the neurons it imports from it. It’s a linear transaction of ‘hi this is what I want from you’, ‘ok this is what I want from you’, ‘ok bye’. No fully connected disgusting network of persistent sim <-> sim connections, no statefulness to speak of, just a brief exchange, two ships passing in the night. tcp takes care of everything getting there, and in this case it’s important. For the actual udp spike exchanges I intend them to eventually be lossy, but more to the point they shouldn’t need to be acking all over the place when it happens so much. I am happy with this. Hopefully boost::asio will play nice with tbb, because I don’t want to write an async socket server, nor do I want yet another dep. boost, tbb, and sqlite are plenty for the sim. The python projects however, particularly the stats nodes, kitchen sink that shit. pip whorin it up. pandas you say? I don’t know what it is but I’ll take three.
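A mock of the one-shot router update exchange – wire format and field names invented here:

import json, socket

def update_imports(peer, wanted_ids):
    # 'hi this is what I want from you', 'ok this is what I want from you', 'ok bye'
    with socket.create_connection(peer) as s:
        f = s.makefile("rw")
        f.write(json.dumps({"want": wanted_ids}) + "\n")
        f.flush()
        theirs = json.loads(f.readline())["want"]
    return theirs   # conn's closed, two ships passing in the night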

I like how this post started out all poetic and ended up a grotesque, incomprehensible, vulgar, poorly capitalized mess. I feel like a perl coder.

prequel

It doesn’t look like I’ve ever blagged about yet another backburnered project: prequel. Coffeescript for sql. NoSQL has its place, and SQL has its place – there are still tons of valid reasons to write raw sql. Frankly I enjoy it. It is however about as expressive as fortran, due to the diff vendors trying to maintain some consistency with each other (and some vendors making gobs of cash with consulting and training). Prequel will not only provide a suite of high level features like, say

* iterators that don’t involve 10 lines of boilerplate
* comprehensions and generators
* vars without declarations, with type inference
* inner functions
* includes

it will do so while letting you selectively target different backends without worrying about the implementation details. should you want to. Part of why I like C++ is that I can straight paste C, I can interop with any standard format shared libs, I can inline or link to assembly with no headache. Similarly this should be a strict superset of common sql and let you write raw code for whatever your platform is without getting in the way. at all.

Additionally it could do cool stuff like yell at you about stupid joins. Using higher level features it could even recombinantly try different compilations looking for the fastest one against real data (is it faster to dump this into a temp table? is this index on this temp table beneficial? …). P sweet piece of vaporware. I should write this, or at least someone should.

sqlitedb

I put my little SQLite wrapper on github. bam. Per the readme all the other ones I found were grossly overcomplicated when all I want is open, execScalar, execCursor, operator[], and RAII. Loading my cells got to be too much for poor old std::map, lol.

Public domain. Woo.

eemahqs

Been a while since I’ve written anything so I guess I should core dump.

* Emacs has become my default anything environment. I kind of miss intellisense, but not terribly. For my C++ I have tags, which work on well formed stuff; for the template and preprocessor heavy stuff VS didn’t have a prayer anyway. For Python I could use pymacs if I were so inclined but I haven’t found it terribly necessary. I’ve deprecated viper in favor of evil and that’s been all roses – best of both worlds, though it has made pure emacs or vim feel unnatural. Honestly, having completed that sentence I hit C-l (my ‘escape’ bind) C-x C-s, to which chrome responded with no action. My config has two bugs I need to track down, and the syntax highlighting for scala and C++ (particularly C++11) leaves much to be desired, but those are hints, and code is just text. I’m quite happy. I do wish eshell supported input / output redirection though (but love how I can do it to buffers).

* Python is definitely my goto language (buh dum chik). Yeah its lambdas are crippled. Yeah its handling of locals leaves much to be desired. Yeah, v3 growing pains. Yeah, the GIL. Big deal. pip. scipy. As little as I use it I respect ruby, and I do wish python were more dynamic at times, and as little as I use haskell I wish python were more functional at times (currying!), but at the end of the day py works. It works anywhere, it’s reasonably expressive to dev in, its native data structures are very fast, its packages work out of the box, if I need to I can move it around with virtualenv, and its repls are great. Weapon of choice.

* Complementary to the above I’ve gotten very used to nix dev. Woo. So much so that now Win and VS feel alien. I feel boxed in without a suite of tools that all play together, just like all those blog posts said I would. Bittersweet. Wow do I not miss the registry.

* I’m stuck on my neural net. I need to generate neurons connected well enough that they’ll do something useful. I want it to be *trivial* to specify new modules (sensory and motor cortices grown around I/O ports, structures analogous to the thalamus and hippocampus, etc) and grow a network around them. The neocortex, the star of the show, is relatively simple – layers and columns distally connected by some central thalamic blob. Easy to chop into cells around the columns, easy to LoD cause it’s so distributed. Other structures are way more difficult, much messier and huge fanouts. The sim has come along wonderfully though – it’s ready for allocless, lockless multithreading. My appropriately named SpikeRouter does its job fine, ready to connect via 0MQ to little hacky python scripts firehosing it webcam input or something. I know good and well I should put that aside and get some pretty graphs and stats going but it’s still all just based on random static nets. Looked at pov-ray for inspiration, considered a dsl, wandering through an xml based design reminiscent of my data creator for specifying random 3D neural distributions, not really happy with it so far (one stab at the radial sampling bit is sketched after this list), still have no idea how to specify synaptic connectivity, iffy about going back to a CA. Stuck. Feh.

* Rewrote the inner neuron step loop with SSE ops, more than doubled the speed. That was easy. Can’t wait to try AVX. Izh and his dudes managed to get it working efficiently in CUDA by using a bitvec of fires and just brute force bit scanning it every neuron every frame. n^2 but way, way faster in practice on GPU cause it removes an entire disgusting global mem scatter op, and resonates really well with my thing because all my cells are limited to 2^16 neurons, which with a max delay of 32 = an 8k bitvec, sweet (a toy of that scan is also sketched after this list). Really makes me want to try switching from a ring buf of fired neurons to a queue of current injections but that’ll make temp memory usage scale by synapse instead of neuron. I dunno, something to try later.

* A grotesque bottleneck is my little compression scheme for syn connections. I added a #define switch to kill it. It really does need to happen. I’ll rewrite it (the whole inner loop really) in assembly later – a proper linked nasm object because AT&T syntax is disgusting. As fun as that will be it is hardcore backburnered for the generation of, you know, not completely random cells.

* I respect Apple and what they do, legal team aside, but I just don’t think it’s right for my main machine. Like Ubuntu. Really all I want is a Lenovo w520 running debian 6 / xfce 4 with 32gb of mem and a droid bionic. osx, please don’t index a hundred thousand header files. You aren’t improving my user experience. But this is one of the delightful things about being an emacs person, home is where elisp is.

* Skyrim. Two solid days of hardcore gaming. And then I hit ~.

* Oh my god there’s a new tribes when did this
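Two sketches promised above. First, one stab at the radial distribution problem – inverse-transform sampling so density can be an arbitrary function of distance from the center; density() and all the knobs here are placeholders:

import numpy as np

def radial_points(n, r_max, density, bins=4096, rng=np.random.default_rng()):
    # build a discrete cdf over radius, weighted by shell volume (r^2 in 3d),
    # then invert it to turn uniform draws into radii with the right density
    r = np.linspace(0.0, r_max, bins)
    cdf = np.cumsum(density(r) * r**2)
    cdf /= cdf[-1]
    radii = np.interp(rng.random(n), cdf, r)
    # uniform directions on the sphere via normalized gaussians
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return v * radii[:, None]

pts = radial_points(10_000, 1.0, lambda r: np.exp(-5 * r))   # arbitrary falloff

Second, a toy of the Izh-style fire scan – a ring of per-frame fire bitvecs, brute force checked per synapse, with numpy standing in for the CUDA kernel and every layout invented:

import numpy as np

N, D, S = 2**16, 32, 100_000                 # neurons, max delay, toy synapse count
rng = np.random.default_rng(0)
pre, post = rng.integers(0, N, S), rng.integers(0, N, S)
delay = rng.integers(1, D, S)
weight = rng.random(S).astype(np.float32)
fired = np.zeros((D, N), dtype=bool)         # ring buffer of fire bitvecs

def inject(t, I):
    # for every synapse: did its source fire exactly `delay` frames ago?
    # if so, accumulate its weight into the target's input current
    hit = fired[(t - delay) % D, pre]
    np.add.at(I, post[hit], weight[hit])

I = np.zeros(N, np.float32)
inject(0, I)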

class acts

I really appreciate class act companies. A lot of companies, gigantic and ramen alike, tend to have their negative points, be it a terrible security track record, despicable patent lawsuits, or scumbag privacy intrusions. The ones that tip me off my dominating ‘meh’ equator into overall positive territory are relatively rare. In particular I have mad respect for Google, Amazon, and ASUS. Damn solid companies, brilliant products, strong track record, admirable foresight, and good intentions. I love those guys. Valve, you cultivate amazing games and are good people. id, you are gods.

time space tradeoff

So I’m tearing up my NN. I’ve finally, finally gotten the sim system jammed into a decent architecture. 13 bytes per neuron GPU, 8.125 bytes (bitvec lol) per synapse CPU, scatter/gather bullshit on CPU, repetitive embarrassingly parallel crunching on GPU, variably staggered stages to maximize utilization of CPU/GPU/network. C++ / CUDA sim, scala mdl, python utils, javascript visualizations (I should put some up here), brought to you by 0mq, sqlite, and json. Probably gonna throw in a central postgres on my server to track cell locations.

I’ve been focusing on the sim pretty exclusively. I want to get it perfect enough so I never have to look at it again. It’s a dumb node, taking commands from the model system. If anything goes wrong it just dumps the cell. I’m particularly proud of the memory efficiency I’m squeezing out of it. Honestly, while 30k syns/neur is the number to beat, 1k is acceptable, so with ~1G of neuron data on a GPU you’re looking at ~1T CPU syn data? Really dude? So 1) synapse memory efficiency must be pursued at all costs and 2) I’ve got a shitload of spare GPU memory to muck with (and probably time too). Honestly I’d love to dump the GPU entirely in favor of an as-lazy-as-I-am event driven sim, no stupid repetitive neuron stepping, but as neurons *CAN AND DO* fire with *NO* input, I don’t think that’s going to be possible. I’ll need to see a giant 3D plot of the Izh kernel functions to get a sense for what it does to begin to approach that.
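Back-of-envelope with those numbers, just to show where the ~1T comes from:

# ~1 GiB of 13-byte neurons on the GPU, synapses at 8.125 bytes each on the CPU
neurons = 2**30 / 13                         # ~82.6M neurons
for syns_per in (1_000, 30_000):
    tib = neurons * syns_per * 8.125 / 2**40
    print(f"{syns_per} syn/neur -> {tib:.2f} TiB of synapse data")
# 1k syn/neur lands around two thirds of a TiB; 30k is ~18 TiB. hence the misering.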

I’ve been devving it in vs. I know, I’m sorry, but intellisense is beautiful. Astonishingly it compiles in g++ though. Thankfully all the crazy crap is relegated to 0mq and sqlite. I only have one VirtualAlloc call that I’ve replaced with new[] for testing that I’ll need to #ifdef away, and after that it was just a matter of putting a space between my template >’s and some other little thing. Pretty awesome. Getting stupid sick of not having closures though, I kind of want to target g++’s 0x extensions. I kind of don’t. This needs to build on windows, osx, and linux.

Honestly I find scala more expressive than python, and it’s obviously more performant (though often more memory hungry). .par is the most awesome thing ever, odersky you suave sob. Haven’t started anything solid with the mdl side yet. I really don’t think there’s any way to do this cell reduction crap analytically – I’m just going to have to profile and take pot shots. And what about phase? This is why I need the sim to breathe, so I can finally find this shiz out. I’m so close.

SQLite is proving to be a better format for transmission of large, relatively complexly structured binary data than JSON. Boost’s built in JSON parser (property_tree) is unbelievably bad. Python takes < 2 sec and 70 megs to load a 9 meg JSON file, boost took 5min 10s and 440 megs. My gawd. I know there’s better libs out there, but they’ll inevitably all be crazy allocy (std::string and map or vector) and that’s nasty. SQLite may have an archaic interface (sans a C++ wrapper), but holy lord it’s fast. JSON for control, sqlite for data, and 0mq for comm. I hope zed didn’t patent that.

Jordan from Real Genius is hot.

some kind of movement

Straight mangling binfmt_elf.c is a no-go, too much kernel residue. Still not seeing any showstoppers thus far. I keep wondering why this hasn’t been done before – nobody else minds the overhead of a full kernel, virtualized hardware, partitioned memory, and a virtual harddisk? Feh.

Updating resume. Overanalysis overwhelming.

Kind of getting roped into an iPhone app involving nltk.

Note to self: you can use 32 bit ints on scala piles to straight up HALVE the memory usage of pile->pile refs. Glorious. This might take extra finagling for my particular impl (neurons, synapses) because of inter-pile refs, but it just might be worth it. I’m a real memory miser.

vdso

Wow, I can do the linux emu in 100% userspace. Apparently since ’02 syscalls have been funneled through the virtual so ‘linux-gate’ to abstract away int 80 / sysenter / syscall. Guess what I’m gonna hook :3 Of course if anything actually tries to int 80 it’ll call a random syscall, but you shouldn’t be running this as a privileged user anyway. There seems to be relatively little to do, really – rip out binfmt_elf.c, use it to load ld-linux.so, copy/paste freebsd’s syscall translations as necessary. Can’t believe nobody’s done it before. What could possibly, possibly go wrong