My family has historically been very into genealogy research, so we've got lots of family tree info. I've been working with my aunt to put together a webapp to create/edit info, see relationships, show ancestors/descendants of individuals. We've also started adding photos recently, which is really cool.
Favorite feature we've built: "are you my cousin", where you choose 2 people in our database and it calculates how they are related. First time I got to write a search function outside of a classroom!
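A minimal sketch of one way such a lookup could work, assuming parent links live in a plain dict. The `PARENTS` data and every function name here are hypothetical, not the app's actual code. It searches upward breadth-first from both people, picks the closest shared ancestor, and turns the two generation counts into a relationship label:

```python
from collections import deque

# Hypothetical data: child -> list of known parents.
PARENTS = {
    "alice": ["grandma", "grandpa"],
    "bob": ["grandma", "grandpa"],
    "carol": ["alice"],
    "dave": ["bob"],
    "erin": ["dave"],
}


def ancestor_depths(person, parents):
    """Breadth-first search upward: maps the person and each ancestor to generations away."""
    depths = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                queue.append(parent)
    return depths


def closest_common_ancestor(a, b, parents):
    """Return (ancestor, generations_from_a, generations_from_b), or None if unrelated."""
    da = ancestor_depths(a, parents)
    db = ancestor_depths(b, parents)
    shared = set(da) & set(db)
    if not shared:
        return None
    # Smallest combined distance wins; the name breaks ties deterministically.
    best = min(shared, key=lambda p: (da[p] + db[p], p))
    return best, da[best], db[best]


def describe(a, b, parents):
    """Very rough labels; half-siblings and in-laws are not distinguished."""
    found = closest_common_ancestor(a, b, parents)
    if found is None:
        return "not related"
    _, up_a, up_b = found
    near, far = min(up_a, up_b), max(up_a, up_b)
    if far == 0:
        return "same person"
    if near == 0:
        return f"direct line, {far} generation(s) apart"
    if far == 1:
        return "siblings"
    if near == 1:
        return f"aunt/uncle line, {far - 1} generation(s) apart"
    return f"cousins: degree {near - 1}, removed {far - near}"
```

With the sample data, `describe("carol", "erin", PARENTS)` reports first cousins once removed as `cousins: degree 1, removed 1`.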
I'm sure Ancestry does all this and more, but it's nice to have something that we built that's just for our family. We also use it all the time to verify relationships/birthdays!
I've not put a ton of thought into a plan to open source it, but we've gotten comments from friends who also have a lot of genealogical info that it would be super helpful for their data, so maybe? It's basically just a CRUD Django app at its core, so I don't know how interesting it would be.
I think a lot of people would be into this. If you're willing to, you should totally do it! You might get some good contributions from the community, too.
Apps like QRBot [1] have the ability to scan ISBNs (and barcodes generally), and have a "history" feature that keeps track of what you've scanned and lets you export (to CSV, among others). The app is free on both iPhone and Android (there is a paid version, don't know what extras it has or if it's just ad-free), but you may want to verify how much history gets stored before you go scan-crazy.
From a US perspective (may apply elsewhere): for books published relatively recently (within the last ~20 years or so), the ISBN is often part of the barcode on the back of the book. ISBN-13s (the updated standard) start with 978 or, for newer assignments, 979, so that prefix is a good clue that the barcode is an ISBN. For a period of time prior to that (and perhaps still applicable to mass market paperbacks), there is a barcode on the back that is NOT an ISBN, but there is an ISBN barcode on the inside front cover. I've not discovered any systematic way to pull an ISBN out of a non-ISBN barcode (though I haven't dug too far -- my collection hasn't reached 4 digits yet and I've been happy to type when scanning wasn't an option).
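The prefix clue can be combined with the standard EAN-13 check digit to decide whether a scanned barcode is plausibly an ISBN-13. This is a rough sketch; the function name is made up for illustration, and it only checks the prefix and checksum, not whether the ISBN was ever actually assigned:

```python
def looks_like_isbn13(code: str) -> bool:
    """Return True if a scanned barcode has an ISBN-13 prefix and a valid check digit."""
    digits = code.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    if digits[:3] not in ("978", "979"):
        return False
    # EAN-13 checksum: weights alternate 1, 3, 1, 3, ... across all 13 digits,
    # and the weighted sum must be a multiple of 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0
```

For example, `looks_like_isbn13("978-0-306-40615-7")` is true, while a 12-digit UPC such as `036000291452` is rejected on length alone.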
Once you have the ISBNs, I like to query against the Open Library API [2], which is a part of the Internet Archive. The information in there is fairly robust, if inconsistent (the capitalization of titles is sometimes as printed on the title page, sometimes Library of Congress format, other minor things). They have a lot of data points available, such as cross-referenced IDs with Goodreads and LibraryThing, but again, this is community-supported data, so YMMV as to completeness or accuracy.
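A hedged sketch of what such a lookup might look like using only the standard library. The `/isbn/<isbn>.json` URL shape and the `identifiers` field names reflect my understanding of Open Library's edition records, so check them against the API docs before relying on this; the helper names are hypothetical:

```python
import json
from urllib.request import urlopen


def edition_url(isbn: str) -> str:
    """Build the (assumed) Open Library edition lookup URL for an ISBN."""
    return f"https://openlibrary.org/isbn/{isbn}.json"


def summarize(record: dict) -> dict:
    """Pull out a few fields, tolerating gaps since the data is community-maintained."""
    ids = record.get("identifiers", {})
    return {
        "title": record.get("title"),
        "publishers": record.get("publishers", []),
        "goodreads": ids.get("goodreads", []),
        "librarything": ids.get("librarything", []),
    }


def fetch(isbn: str) -> dict:
    """Network call; raises urllib.error.HTTPError if the ISBN is unknown."""
    with urlopen(edition_url(isbn), timeout=10) as resp:
        return summarize(json.load(resp))
```

Every field is read with `.get` and a default because any given record may be missing it.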
Another note -- many books have separate ISBNs for hardcover editions, trade paperback editions, mass market editions, eBooks, etc (and sometimes don't have an ISBN at all for things like Book of the Month Club editions). I don't know if this is a requirement, or a luxury that big publishers have, but it is something I've noticed (you'll sometimes see multiple ISBNs listed on the copyright page, along with their formats -- also you may see related editions on Indiebound [3], along with their ISBNs). A cursory glance at Open Library doesn't seem to have a data point distinction for this (which is unfortunate), so you may still have to note this, but theoretically it may be possible to get this information from the ISBN directly at some point.
Source for ^^: I read a lot, have a lot of books, briefly ran a (failed) specialized online bookstore, and wrote a CLI tool [4] for myself to solve this very issue.
I built a CLI music player that can handle both local mp3s and Spotify tracks in the same playlist, because I wanted that.
I also recently built a tool to help me keep track of my books (what I own, what I want to read, what I've borrowed, etc).
Great read! I also hadn't heard of TheStoryGraph before, so double points.
I'm also very interested in owning my own data (particularly when it comes to things like book lists) and I'm thinking about how to incorporate something like ActivityPub into my CLI book-list application. I'd love to live in a world with an indieweb version of g*dreads.
I'm active in the Secure Scuttlebutt community, which is a decentralization platform that some people from Activity Streams communities might also enjoy. One of the clients has a book review feature which is great and I wish was incorporated by more clients. Something similar to that could work well as a small ActivityPub app.
Very interesting. Which client has the book review feature?
I haven't yet dipped my toes into Secure Scuttlebutt -- I joined a Mastodon instance once upon a time, but it didn't hold my attention too well. I've always associated Secure Scuttlebutt with a similar (albeit more distributed) micro-blogging thing, but that's probably my own ignorance speaking (nothing in the protocol seems to limit interactions to this, based on a quick skim).
Nothing against micro-blogging in particular, I just personally find the signal-to-noise ratio suboptimal (though maybe I'm just on the wrong networks, following the wrong people!). I'd love to build something into my book-tracking app so I could, like, share a list of books I have on my shelf that I would be willing to lend to people, or a list of books I'm interested in reading that I could borrow from people, or books for trade, or just reviews (I've been doing a bit of pipe dreaming lately). I'm gonna dig a bit more into the Secure Scuttlebutt and ActivityPub protocols to see if this is something that could feasibly be done.
vending machine hacks, although both illegal and immoral, provide an immense source of joy in knowing that you _beat the system_, regardless of what treat may come out.
here's my biggest vending machine hack: freshman year in college, there was a vending machine in our dorm building. it wasn't cheap (obviously), but it got lots of use because it had something extra: a card reader for our IDs. if we pre-loaded our card with money, we could use it to buy sodas, snacks, and washer/dryer cycles. very convenient.
but the vending machine had a quirk: occasionally, for reasons unknown, it would just start spitting out coins. it was such a rare occurrence (and a very exciting one) that it became known colloquially as "hitting the jackpot". every time we went to the machine, we would cross our fingers, hoping to "win".
while it seemed like a random occurrence at first, i knew it couldn't be completely random, and i wanted to figure out _why_ it was happening. so i began investigating. whenever i went to the machine, i would try different combinations of buttons, choosing different rows/columns, but i couldn't recreate the behavior. accompanying my friends to the machine, i paid attention to how they were inputting their order (and if they subsequently "won") to try and figure it out.
after lots of observation, i found a pattern: everyone who ever won used their card to buy something. focusing on the card reader, i also found that these people had accidentally put their card in _incorrectly_ before righting it. with a theory loosely in place, i put some money on my card and gave it a whirl.
and it worked! here's the behavior: if you put your card in incorrectly, the machine couldn't read it because the stripe was on the wrong side, so it spit the card back out and flashed an error on the screen which would clear after a few seconds. while the error was showing, the machine would not accept your card. however, if you put your card in _immediately_ after the error cleared, here's what happened:
1. screen displays the amount of money on your card
2. choose your drink
3. drink is vended while the same amount of money is displayed (i.e. not subtracting the price of the drink).
4. the machine begins spitting out coins in the amount of your card value minus the price of the drink
5. the card is returned with the _original balance_ still intact
so, if i had $20 on my card, and i bought a powerade that costs $1.50, i would walk away from the machine with a powerade, $20 still on my card, and $18.50 in change.
horribly immoral and illegal? absolutely. however, i still feel immensely proud that i not only figured out what was happening, but how to reproduce it.
i mean, it was purposefully acquiring money that didn't exactly belong to me. the Right Thing To Do would have been to report the error and get it corrected, instead of leaving it open for exploit, like when my friends or I needed quarters for laundry.
"learning" something, i'm finding, is very hard to do outside of a school setting, so i've been trying to think of why that is. what i've found is this: in school, you're given assignments, little (or sometimes not-so-little) programs to write that do something. of course, the assignments are designed to illustrate a point, something you've learned recently. but it can serve the same purpose in real life. think of something (doesn't have to be a novel idea), then try to build the functionality of it. it'll be frustrating working through finding bugs and learning how to do xyz, but you'll find that you've learned to use your tools, even if the project doesn't ultimately get finished.
nice read! I wonder how much of it is still relevant today (i.e. gc speed in Java, relative superiority of its library, etc.). OCaml is one of those languages that interests me, but I don't know if I'd ever have a reason to learn it to do something specific, when the languages I've been using are already pretty flexible.
Also (unfortunately, IMHO), this talks a lot about speed, which was a big problem in the Pentium 200 days with minimal RAM, cache, etc. Nowadays, any old poorly written, garbage-collected program runs speedy as hell. Seems like the golden days of optimization are lost.
Point 4 (algebraic data types) is really what's important; especially tagged unions. ML's type constructors map very very well to abstract syntax trees. Java has no equivalent (enums are a distant cousin).
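To make the mapping concrete, here is a rough approximation of a tagged union for a tiny expression language. It is written in Python rather than ML, so the type checker does less for you; the `Num`/`Add`/`Mul` node names are invented for illustration. Each dataclass plays the role of one ML type constructor, and the evaluator has one branch per variant:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "sum type": an expression is exactly one of these variants.
Expr = Union[Num, Add, Mul]


def evaluate(expr: Expr) -> int:
    """Dispatch on the variant, as an ML `match` would."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Mul):
        return evaluate(expr.left) * evaluate(expr.right)
    raise TypeError(f"not an expression node: {expr!r}")
```

In ML the compiler would also warn when a new variant is added and a match forgets to handle it; here that check only happens at runtime via the final `raise`.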
Of course if you are a believer in higher-order abstract syntax (the idea that one should use constructs such as lambda abstractions as part of your syntax tree), Scheme is a better fit, as ML's type system doesn't allow for such wildly typed syntax trees, nor does it allow data-as-code. (I think HOAS is hogwash, but then I'm a firm believer in well-typed code and against data-as-code.)
Of course, if your AST is more a graph than a tree, you're better with a logic language such as Mercury or Prolog. I've had very good experiences writing compilers and interpreters in Mercury.
Programs in many graphical languages (LabVIEW, Simulink) are graphs. Of course you can represent them as trees if you use variable names, but that's just obscuring the fact that it's a graph.
For instance, this appears when you do common subexpression elimination -- if you refer to the same code in two or more places (e.g. call the same function, compute the same math expression), and it's known not to have (or depend on) any side effects, you can compute its value only once and refer to this value in these places. I recall it's covered even in the Dragon Book, which is kind of old.
That sounds rather confused: an Abstract Syntax Tree by definition can't be a graph. What you are referring to is some kind of intermediate form used by a compiler. The details may vary greatly, but, for example, directed acyclic graphs are often used for common subexpression elimination. They are most often generated from some other intermediate form (straight-line code), though, and they have more in common with the code that has to be generated than with the original source code. Of course one can imagine annotating the AST and performing this optimization directly on it, but then it is not an AST anymore.
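A small sketch of the idea, assuming expressions are nested `(op, left, right)` tuples with string leaves and that every operator is free of side effects. The representation and the `t0`, `t1`, ... temporary names are invented for illustration, so leaves must not themselves be named like temporaries. Structurally identical subexpressions are given the same temporary, which turns the tree into a DAG:

```python
def eliminate_common_subexpressions(expr):
    """Return (result_name, code), where code is a list of (temp, op, arg1, arg2)."""
    temps = {}  # (op, arg1, arg2) -> temporary already holding that value
    code = []   # straight-line code; each distinct computation appears once

    def visit(node):
        if isinstance(node, str):  # a variable or constant leaf
            return node
        op, left, right = node
        key = (op, visit(left), visit(right))
        if key not in temps:       # first time this computation is seen
            temps[key] = f"t{len(temps)}"
            code.append((temps[key],) + key)
        return temps[key]          # later uses share the same temporary

    return visit(expr), code
```

For `("+", ("*", "a", "b"), ("*", "a", "b"))` this yields `("t1", [("t0", "*", "a", "b"), ("t1", "+", "t0", "t0")])`: the product is computed once and referenced twice.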
Uh, there are many papers and examples of using HOAS in statically typed settings, such as ML, Haskell, Twelf, Coq, etc. There is still the issue of exotic terms, but there are known type system tricks for helping with that as well. And the exotic term problem would be even worse in Scheme.
But what of actually parsing a program in HOAS? The idea of HOAS is to translate the text "\x -> x + 5" into the expression Lam (\x -> Op("+", Var x, Const 5)). But how does one translate the string "x" into the variable name x short of data-as-code (or a similar meta-facility)?
The best example I have handy is some Scala code written by a student I worked with. I'm not sure if this is necessarily compatible with the latest version of their parser combinator library, but it is hopefully suggestive:

import scala.util.parsing.combinator.syntactical.StandardTokenParsers
import scala.util.parsing.input._
import scala.util.parsing.syntax._

object HoasTest extends StandardTokenParsers {
  lexical.delimiters ++= List("(", ")", "\\", ".", ":", "=", "->", "+", "{", "}", ",", "*")
  lexical.reserved ++= List("Bool", "Nat", "true", "false", "if", "then", "else", "succ",
                            "pred", "iszero", "let", "in", "fst", "snd")
  import lexical.NumericLit
  import lexical.Identifier

  def Term(base: Parser[Term]): Parser[Term] = positioned(
      SimpleTerm(base) ~ SimpleTerm(base) ^^ { case fun~arg => Apply(fun,arg) } // should be left-assoc
    | SimpleTerm(base) ~ ("+" ~> SimpleTerm(base)) ^^ { case a~b => Plus(a,b) } // should be left-assoc
    | SimpleTerm(base)
    | failure("illegal start of term"))

  def SimpleTerm(base: Parser[Term]): Parser[Term] = positioned(
      (("\\" ~> ident <~ ".") into (x => Parser { in =>
        def body(arg: Term): ParseResult[Term] = Term(Identifier(x) ^^^ arg | base)(in)
        body(new Term {}) match {
          case Success(_,rest) => Success(Lambda(x, (arg) => body(arg).get), rest)
          case f => f
        }
      }))
    | "(" ~> Term(base) <~ ")"
    | base
    | failure("illegal start of simple term"))

  def BaseTerm: Parser[Term] = positioned(
      numericLit ^^ { case n => Num(n.toInt) }
    | failure("unrecognized base term"))

  abstract class Term extends Positional
  case class Num(n: Int) extends Term
  case class Plus(a: Term, b: Term) extends Term
  case class Lambda(x: String, body: Term=>Term) extends Term {
    override def toString = "\\"+x+".("+body(new Term { override def toString = x }).toString+")"
  }
  case class Apply(fun: Term, arg: Term) extends Term
}
As a pair of predicates, one for nodes, one for edges:
:- pred asg_nodes(asg, node).
:- mode asg_nodes(in, out) is nondet.
:- mode asg_nodes(in, in) is semidet.
:- pred asg_edges(asg, node, node).
:- mode asg_edges(in, out, out) is nondet.
:- mode asg_edges(in, in, in) is semidet.
Underlying are traditional sets or what-have-you but you never have to deal with these at high levels of coding. You could then do fancy stuff like defining ancestry as:
:- pred asg_ancestor(asg, node, node).
:- mode asg_ancestor(in, out, out) is nondet.
:- mode asg_ancestor(in, in, in) is semidet.
asg_ancestor(ASG, A, A).
asg_ancestor(ASG, A, B) :- asg_edges(ASG, A, C), asg_ancestor(ASG, C, B).
Of course you'd want to wrap this stuff up in a typeclass or some such and give them useful names.
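For readers who don't know Mercury, roughly the same relation can be sketched in Python, assuming the graph's edges are a set of `(parent, child)` pairs. That representation is my assumption; only the `asg_ancestor` name is borrowed from the predicates above. Unlike the naive recursive clause, this version tracks visited nodes, so it also terminates on cyclic graphs:

```python
def asg_ancestor(edges, a, b):
    """True if b is reachable from a by following zero or more edges."""
    seen = set()
    stack = [a]
    while stack:
        node = stack.pop()
        if node == b:
            return True  # zero edges (a == b) or a path was found
        if node in seen:
            continue     # already explored; also guards against cycles
        seen.add(node)
        stack.extend(child for (parent, child) in edges if parent == node)
    return False
```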
i understand your frustration. however, like you said, they're on vacation; one of the great things about that is you can relax and be yourself, and if you enjoy hollering at football in a bar, more power to ya!
there are plenty of very smart, dedicated people i know who work hard all day, but once quitting time comes they want sports and beer. they're not a waste of space for indulging; they work hard doing important things, and enjoy themselves when they're done.
sounds like you're frustrated with the overall situation, though (stuck in a foreign town, girlfriend is sick, etc.). you may find yourself more resilient if things weren't so stressful in life!