A chat from opn:p2p-hackers on the attack-resistance of Google PageRank, anonymous remailer networks, and lots of stuff in between. [2002-03-02]

in any case, i spent quite a bit of time today analyzing google's attack resistance
motivated in large part by this scieno thing
ooh, do share
gpl + commercial ? like sleepycat
GPLing your software allows you to sell it to companies
also like ghostscript :)
well, has anyone here read the pagerank paper?
Yeah, but I assume that things have been greatly changed since then.
i don't think so
it's hard to say for sure, but the original PR is very smart stuff
and my analysis indicates that it has a good attack-resistance property
the other thing about PR is that it's quite tunable through the E(u) mechanism
E(u) mechanism ?
the paper talks about this a little, but obviously they had only started thinking about it
if i had to guess, it would be more or less vanilla PR with a well-tuned E(u)
.google pagerank e(u)
pagerank e(u): http://directory.google.com/Top/Arts/Music/Magazines_and_E-zines/U
heh, no
you need the 1998 paper by page
copy at http://www.levien.com/thesis/refs/pageranksub.ps.gz (url typed from memory, lemme know if it 404s)
works, thanks
aka http://www-db.stanford.edu/~backrub/pageranksub.ps
oh good
so i can explain the central concept in a few minutes, if you'd like
please
ok
so the PR algorithm can be modelled as a random walk
essentially, what's in the page paper is the following:
pick a random web page from your list of 3 billion or whatever urls
pick a link at random, and follow it
now, roll a 6-sided die
if it's 6, stop, otherwise go to "pick a link"
so you have a nice exponential tail distribution of walk lengths
ok, so now when you stop, mark the url you stop on
after a huge number of runs, the number of marks on a url corresponds to its rank
it's a beautiful and simple idea
E(u) corresponds to the initial step of picking a page at random
this _doesn't_ have to be a uniform distribution
and therein lies the interesting part :)
so, here's my analysis (assuming i haven't put you guys to sleep yet)
listening
you do the same thing as the analysis of the advogato trust metric
* AaronSw fires up MacGhostViewX and skims the paper
all web pages are either "good" or "bad"
the algorithm doesn't know, only santa claus
the idea is to see if you can prove something about bad pages having low rank and good pages a high rank
that's the goal of a good attack-resistant algorithm
please define 'good' / 'bad'
ah, ok
that's an interesting question
the mathematical part of the analysis doesn't care _how_ you define it
so there are a couple of interesting ways of doing it
one is by how interesting/relevant the page is
this is very subjective, but is still the goal of the thing
another is by assuming an attack
so, the trick is to define E(u) in such a way so that page rank correlates with your concept of "good" ?
so the "bad" pages are the ones under the attacker's control
in this case, the question is: how well can the algo resist assigning a high rank to pages under an attacker's control
*nod*
straight keyword searching, for example, has no attack-resistance
all an attacker needs to do is put the juicy terms in the searchable part of the page, and bingo
so, the no-attack case is also interesting
ie, assuming non-adversarial producers of web pages, how can you get the most relevant results?
this is a question that a lot of people have thought about
but i'm only really considering the attack case
an interesting hypothesis is that an attack resistant ranking algo will give good subjective results in the non-attack case as well, but that would be hard to substantiate :)
Interesting. Conventional wisdom would seem to indicate that E is based on how often a page changes and the pagerank of the pages that it links to...
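The random-walk model described above (pick a page from E(u), follow random links, stop on a die roll of 6, mark the endpoint) can be sketched as a Monte Carlo simulation. The graph, node names, and function name here are illustrative, not from the chat:

```python
import random

def random_walk_rank(links, E, stop_prob=1/6, n_walks=100_000, seed=42):
    """Monte Carlo PageRank: start at a page drawn from E, follow a
    random outgoing link, stop with probability stop_prob after each
    hop, and count where walks end."""
    rng = random.Random(seed)
    marks = {u: 0 for u in links}
    for _ in range(n_walks):
        u = rng.choice(E)               # E(u): the initial distribution
        while links[u] and rng.random() > stop_prob:
            u = rng.choice(links[u])    # follow a link at random
        marks[u] += 1
    total = sum(marks.values())
    return {u: count / total for u, count in marks.items()}

# toy web: a, b, c all link to "hub", so it should collect the most marks
links = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
rank = random_walk_rank(links, E=list(links))
```

Google evaluates the equivalent quantity with linear algebra rather than actual walks, as the chat notes later, but the walk picture is the one the analysis uses.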
ok, forget e for now
one of the interesting things is that PR is not hugely sensitive to E
AaronSw: yea, i thought pagerank was more complex than this
they ran it with a uniform distribution and also a single start page
and got results that were surprisingly not all that different
i am simplifying a bit
but not _that_ much
assumed that
anyways, back to the attack resistance
yes
so, as an intermediate step in the analysis, assume that E is only good pages
or, more precisely, is 0 for all bad pages
k
obviously, if you have this assumption, there is a trivially attack resistant ranking algo
just pick a page from E :)
heh, how resistant is E though ;p
but we'll assume that we're doing PR, and it's doing the random walk thing
tav: the assumption is 100%
ok, so now we further subdivide the good pages into "good" and "confused"
a confused page is one that is itself good, but contains a link to a bad page
right
actually, it might make more sense to talk about the links
* tav sees now
good links, evil links, and confused links
which are g->g, b->whatever, and g->b, resp.
so in this framework, the analysis is actually easy
any given walk that crosses a confused or evil link can result in a bad page
but a walk that crosses only good links results in a good page
devilishly simple and elegant
so, for a walk of length k, and the probability of a page containing a confused link p, you get a chance of a good link at least p^k
(1-p)^k sorry
what's more interesting is that it's not just the probability of a random good page containing a confused link, it's biased toward high-ranked pages
?
i didn't get that last line
ok, lemme spell that out, 'coz it's interesting
assume that we have 'god's web' consisting of only the good pages
now, we run PR on this dataset, resulting in a rank for each page
(actually, as a minor technicality, you want a rank for each walk length)
but close enough :)
heh, k
i mean, there are going to be some "good" pages that are boring and nobody links to
and others, such as advogato, that get a huge rank
(advogato gets a _lot_ of google-juice)
so now, phase back into the real world
which contains G's W as a subset
(only 7 ;p)
so now we can talk about the real-world rank in terms of god's rank
g's r being the output of the PR algorithm when run over g's w
sure
an attack-resistant ranking algo would have bounds
ok, so here's the relationship:
but, google doesn't have a stale control set afaict?
at each step in the walk, you look at the probability of a confused set, weighted by the rank at that step
confused link, sorry
weighted by the g's rank at that step
and, for a walk of length k, the final probability of a good link is the product, over step index [0..k) of 1-(confused prob at that step)
so, under the assumption that high-ranked pages are less likely to have confused links, you'll actually do better than (1-p)^k
hmz
shall we look at E now?
k...
ok, so the effect on the ranking results is clear
you just multiply in 1-(the probability that a page in E is bad)
...to the probability that you end up with a good rank
huh?
so the implications are quite clear
ok, a little slower...
surely that's (probability of a good rank)^2 ?
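The bound described here (probability of a good endpoint = product over the walk's steps of 1 minus the rank-weighted confused-link probability at that step) is a one-liner to compute. A minimal sketch, with the per-step probabilities as made-up inputs:

```python
def good_walk_prob(confused_probs):
    """Probability that a k-step walk ends on a good page, given the
    (rank-weighted) probability of crossing a confused or evil link at
    each of the k steps."""
    prob = 1.0
    for p in confused_probs:
        prob *= (1.0 - p)
    return prob

# uniform case: p = 0.1 at every step of a 4-hop walk -> exactly (1-p)^4
uniform = good_walk_prob([0.1] * 4)
# if high-ranked pages are less likely to carry confused links, the
# per-step probabilities shrink and the bound improves, as the chat says
biased = good_walk_prob([0.1, 0.05, 0.03, 0.02])
```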
E now contains both good and bad pages
so, choosing a page from E, if it's bad, you get a bad page at the end of the walk
if it's good, you get a good page with exactly the same probability as if you had started from god's E
so the real-world good prob = E's good prob * the good prob of running PR over god's E
right
the latter meaning running PR over the real web, starting with god's e
so: "how to optimize google" is really clear now
E is your whitelist
ie, you try to keep bad pages out of it
and it can be much smaller than the whole web
so, the _other_ tunable parameter is that 1/6 chance of stopping the random walk
(the paper gives 15%, but close enough)
and now the meaning of that is clear as well
any reason for the value?
yes
the larger it is, the better the attack resistance
but the smaller it is, the more of the web it covers
ie, with a value of 1, you have exactly e
but if e is a whitelist of a few thousand pages, then you're not doing a good job
so you assume that most interesting pages are within a few hops of a page in e
and the 15% value corresponds to "a few hops"
*** dev0 has quit IRC (Read error: 110 (Connection timed out))
ie, at 15%, half of your walks are <= (>) 4 hops
i'm sorry, but this is fucking brilliant
(google's original idea)
heh. i saw the brilliance earlier, but you lost me towards the latter half
it's _really_ attack resistant, and it has a nice way of getting a lot of leverage from human input
you going to write up your analysis?
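The trade-off above is easy to check numerically: with a per-step stopping probability, walk lengths are geometrically distributed, so the "a few hops" claim follows directly. A tiny sketch with a hypothetical helper name:

```python
def median_walk_length(stop_prob):
    """Smallest k such that at least half of all walks have stopped
    within k hops, for a geometric walk-length distribution with the
    given per-hop stopping probability."""
    k, surviving = 0, 1.0
    while surviving > 0.5:
        surviving *= (1.0 - stop_prob)
        k += 1
    return k

# with the 15% figure from the PageRank paper, roughly half of all
# walks stop within four to five hops of a page in E
hops = median_walk_length(0.15)
```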
yes indeedy
chapter 6 of ye olde thessys
i'd love to read it after reading the original paper
and, in around 2019 or so, we'll all be able to implement it (assuming they get a valid patent)
i'm fishing for people to tell me that :)
* raph finds it difficult to find time for the thesis with all the other interesting bits of work & play
so the more motivation, the better :)
hehe
well, i've always liked reading your work
*** xena has quit IRC ("you son of a bitch")
*** xena (xena@mewtwo.espnow.com) has joined the channel
ok, so now i'm thinking about something else
which is the relationship between PR and advogato
there are a lot of similarities:
they both work from the same graph structure
(the fact that nodes in advo are people, and nodes in PR are web pages, is irrelevant to the math)
*nod*
and they both have an attack resistant property
...which is probably quite similar
("resistance", pardon my syntax)
ie, the class of attacks resisted by advo and pr are probably about the same
but there is a fundamental difference
network flow vs random walks
(the evaluation of pr is not done with walks, it's done with some fun linear algebra)
so advo gives you a yes/no, but it's possible to enrich that by doing several runs
which is what master/journeyman/apprentice is all about
that gives you a [0..4) range
pr, by contrast gives you a real-valued rank
perhaps the deepest difference is that pr is deterministic
and, even more so, not sensitive to small perturbations
two runs of pr on very similar graphs will give you very similar ranks
as i'm sure you guys have noticed with actual google searches at different times
it's a feature; advo's nondeterminism is a misfeature
oh? hmz
but it does raise some interesting questions; most notably, how would the algos perform on each other's turf?
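The "fun linear algebra" evaluation mentioned above is power iteration toward the principal eigenvector of the link matrix, which is what makes PR deterministic. A minimal pure-Python sketch (graph and names illustrative; a real implementation would use an off-the-shelf linear algebra package, as the chat suggests for the advogato graph):

```python
def pagerank_power(links, damping=0.85, iters=100):
    """Deterministic PageRank via power iteration: repeatedly apply the
    transition matrix until the rank vector settles on the principal
    eigenvector. Fine for a few-thousand-node graph like advogato's."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = links[u]
            if not out:                     # dangling node: spread uniformly
                for v in nodes:
                    new[v] += damping * rank[u] / n
            else:
                for v in out:
                    new[v] += damping * rank[u] / len(out)
        rank = new
    return rank

# same toy web as before: three pages link to "hub"
links = {"hub": ["a"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
rank = pagerank_power(links)
```

Unlike the Monte Carlo walk, two runs on the same graph give identical ranks, and small graph perturbations move the ranks only slightly.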
running pr on the advo graph is very, very very doable
the graph is only a few thousand nodes, it would be no sweat to just compute the principal eigenvalue, probably with an off-the-shelf linear algebra package
not being overly familiar with either, i feel i'm presently ill equipped to answer that
what's your email address?
raph@acm.org among others
that's my "academic" address
running the advo algorithm on google's graph presents a bigger challenge
i should join the acm sometime. now, if only i had some money
i mean, they have a really impressive supercomputer to implement pr, and no doubt a super-nicely tuned impl
the pubs are not that interesting :(
* tav mentions the google contest to raph
.google programming contest
google programming contest google: http://www.google.com/programming-contest
yeah, i would win, wouldn't i? ;p
most blatantly
but the program actually has to run to completion, yes?
* raph wonders if he's the only person who sees that as a blatant recruitment scheme
recruitment / r&d
oh, only 900k pages, that's interesting
'coz advo evaluates 7k pages in about 60s on a modest computer
google aren't the first to do a contest
(modest by today's standards)
the advo impl is in c and pretty well tuned
anyways, i'll be sure to email you my thoughts as soon as i clear the pagerank paper
i think a 900k dataset might be doable in an overnight run
s/7k pages/7k nodes/
nonetheless, you have tempted me
* raph has no interest in being recruited by google
but this looks easy and fun
yah, you'll find it worth reading
google gets the ip of the submissions
wow, 10k is cheap
that's like the cost of filing one patent
not an issue for me tho - the advo trust metric is public domain
really great chatting w/ you guys
about time for bed for me
.time pst
Mar. 2, 2002 11:13 pm US/Pacific
.time gmt
Mar. 3, 2002 7:13 am GMT
tav: where are you?
london
(yes, raph, you should write all of this. we want to read it :)
arma!
start a petition campaign
eta on your write up?
hey, if it gets ms to open kerberos, it might work on me
getting the thesis done
anti google patent?
heh
tav: i'm planning on spending massive time this week
my job lets me do that :)
yeah, don't do the google programming contest. they're really smart to do it, because it *will* work.
so expect it soon
excellent.
arma: why not?
i've been being lame about paying attention to your stuff lately. :(
* raph doesn't see the downside
ditto :(
er, they get the IP, right?
* raph doesn't care
hum. ok.
or, to put it more precisely
i'd say do it outside of their parameters.
there is no ip involved other than what's already been released in mod_virgule
no, wait
they would be getting, as far as i can see it, the rights to use the m/ code commercially
that i don't care _much_ about, but it's worth _something_
i have a better idea
hey, they'll fly you to san francisco if you win! :)
i'll post the idea on advo
arma: yeah, that's a major draw
* raph is only 2 degrees of separation from google top brass anyway
erm, 3
advisor's wife's students = founders of the company
arma, so what you been up to? is reputation still alive? one never knows in this post.com env
reputation is still alive. we will be until june.
we've got a bunch of almost-customers, but we've had them for 2 years now.
*** coderman has quit IRC (Read error: 110 (Connection timed out))
if any of them turn into actual customers in the next month or two, then we continue until december
right
* raph wishes you luck with that
i know it's tough
if several of them convert, then we continue for at least several years, which is Good.
i assume that's been taking more attention, and chord/etc less?
freehaven
we wrote a cute little blurb on reputation and privacy enhancing technologies, for the CFP proceedings
http://freehaven.net/~arma/cfp02.html
it's got my latest thoughts on free haven, in a broad sense.
and has brief summaries of the two recent mix-net reputation papers, too
i still haven't read the oceanstore paper.
the p2p workshop starts thursday. presenting at financial crypto next monday. need to plan a talk.
i've mostly been doing damage control to make sure nothing gets too little attention,
fair enough
and putting the rest of it into making sure pet2002.org works.
hm, the fc one looks interesting
not that i've carefully read your earlier ones :)
so arma: how carefully did you read this one? (casc-rep)
not at all :(
o. so it looks interesting from the title then ;)
one of the reasons why I'm so excited about this google thing is that it might be the analysis tool i need for gznork exchange rates
from the summary, dingbat!
gznork is the name for that thing?
gznork is my p2p network based on stamp-trading
right. that thing we were arguing about a year ago
yup
the code's been moving forward slowly
it was really cool, but had so many holes in it, i couldn't wrap my brain around it
i'm thinking of redoing it in python
(you can tell i've been spending time with zooko)
arma: yes, i would be the only one in the world who can wrap his brain around it, ...if only i could :)
hm. so while you're adding things to your list of hard problems, i'll reiterate my problem from earlier
ok
(the secret to research success is to do the problems you think are easy, but everybody else thinks are hard)
in casc-rep, i suggest that we use an advogato-like metric to limit the number of Bad mix participants
cha-ching (another cite! :)
basically, if the proportion of bad nodes is too high, the adversary can just take over the system
right
so in my analysis, i'd like to limit the fraction of bad nodes
that model is intuitively familiar to me
but advogato doesn't let me do that. it just lets me limit the *number*.
i need to start talking about distance from seeds, etc, if i want to start thinking about a fraction.
you should be able to get to fraction
yes
distance from seeds is, i believe, essential
unfortunately
i can say sigma over various distances provides a bound
yeah.
i haven't thought about this one in a while. hrm.
it's interesting to note that the "seed" in advo corresponds almost exactly to E(u) in PageRank
my earlier attempt was to come up with a "proof of bandwidth" operation, similar to proof of work,
and, in Google's case, that's actually useful
with the assumption that the adversary won't be able to pass "too many" proofs of bandwidth at once
hmm, interesting
but that turned out to be really nasty too, because either you have a central site that 'gets' the packets to verify they all showed up (obviously not a good plan), or you pair people up (or put them in groups of n, etc),
yeah, proof of work is much easier
and assuming that both work and bandwidth have a fixed cost per unit,
and any time most/all of the group is bad, then they can lie.
you really get the same result in terms of costs
they're different assumptions tho
yeah
i eventually decided that if my adversary was the nsa, and my 'good' nodes are volunteers, then i really better stop looking down that path.
yes
because cost is not a big problem for them
* raph tries to come up with a pun on "cost plus" but n/m
* arma doesn't get out much :)
ok, so in the remailer network it's all source-routed, right?
in the current one, yes
(this is certainly the case for the classic mix)
in casc-rep, we build cascades out of participating nodes
oh
that is, we define fixed routes, and traffic goes through those routes
it helps the intersection attack tremendously
ok, i need to read the paper then
yes
in some sense it's still source-routed, though,
* raph added random path choosing to premail at the request of cpunks
because if we're trying to get odds of a bad path down to one in a million,
and i always had the feeling that it _decreased_ security
then people will need to chain cascades
it sort of got fuzzy at that point
but obviously if your path is only 4 or 5 hops, you're not going to get that level of security with any non-trivial adversary
* raph thinks
?
i mean, any mix-like thing is still going to have the problem that if the adv can monitor bandwidth on entry and exit to the net, the intersection attack is going to work
assuming detectable traffic patterns, which i think is a _very_ good assumption in r/l
so, to me, this places an upper bound on how good you need the mix itself to be
i handwave some about approaches for having the cascades themselves send dummy messages.
ok
i think it could actually work
so... let's consider a simpler network
plain old mixes
basically the cascades need to be sending test messages anyway,
plain old source routing
and they simply address them to/from people who've previously used the cascade
ok. plain mix net.
* raph is not convinced by dummy messages _at all_
at all?
i think they raise the bar..
not very much
i get the feeling they raise it from the equivalent of 32 bit crypto to the equivalent of 40 bit crypto
fair enough
so. plain mix net.
yes
so the application of an advo-like tm is _really_ easy to see
the remops themselves do peer certs
...
and in this case, the cert basically means "cert subject is not nsa"
right
right. it is *not* based on observed performance
yeah, so observed performance is interesting
you can also do that the same way
if the net is going to remain fairly small, then you just do all-to-all pinging
with each node publishing its own list
well, the premise of casc-rep is a bit different:
the client just retrieves all n^2 values, and evaluates those according to trust
my goal is to allow more dynamic participants,
right
so random people can volunteer for a day, not volunteer the next, etc
i want all sorts of people to be able to join, and the trust metric needs to handle this
so you're really building a full p2p network :)
so the trust graph should be reasonably stable even if you have join/leave
(right. it's possible i'm just nuts and the real reason the network is small is because nobody wants to be an exit relay....but i'll solve that down the road.)
well, if we're talking about the real remailer net, the deeper problem is that port 25 sucks
if you had a spam-resistant messaging system, a pseudonymity service would be quite valuable
but that's a different story... let's keep focussed on trust
so if you do source routing, you're very dependent on your choice of seed
in fact, that becomes your killer critical problem
right
it's likely that the "trustworthy seed" problem has no technological solution
only (perhaps) social ones
speaking of, i worry that needing to play the trust network game will seriously slow down the dynamic part of the network
you can't just grab the rpm and go
* raph notes that the recent morpheus problem can be seen as a seed failure
you need to go out for a beer with somebody, etc.
yes, you really don't have throwaway participation in a trust network
and i think that will keep the list of participants small.
to me, the answer to that problem is clear: you offer long-term valuable services to those who do participate
oo. another question. you know how seti@home gets people to do it because of the rankings? is there a comparable thing here?
rankings meaning amount of cycles people are burning?
right. "my team is #3"
aha
that's an interesting q
primarily social
nobody would do seti@home if there were no stats page. that's a social thing. but it's critical.
right
it's like the peacock's tail
it has a very high cost, but _showing_ that you can afford that cost is valuable for getting peafucked
interesting.
(it's a standard model in evo psych, if you've been following that)
i'd followed them separately, but hadn't tried connecting them.
* raph spends too much of his time thinking about this stuff :)
hey, i'm going to be in SF apr 13 through 20ish
you going to be around?
think so
great
i mean, almost certainly, for at least a subset of that time
* raph frequently has 1 or 2 day obligations like staff meetings etc
i'm doing two workshops and a conference while i'm there
so let's get together fer sherr
but i'll have an evening free sometime.
*** coderman (~fibrill@12-224-237-178.client.attbi.com) has joined the channel
yes, i'd go out of my way to ensure that
too bad it's before the google contest
'coz otherwise we could get them to fly me to sf :)
lol, you in the google contest raph?
so i can't talk you into going to pet2002? it'll be fun :)
maybe one day
it's a sunday and monday
i dunno, look at the list of accepted papers and you'll decide either way
so back to casc-rep
yes
there was another really interesting thing we noticed when working on it
the idea is to group mixes into cascades,
ok, so tell me what a cascade is
and if anybody in the cascade screws up, they all suffer
in 5 words or less
a fixed-order set of mixes. it's a path. lots of people use it.
so it's a data structure? an ordered list of nodes?
it's the statement "use A first, then B, then C, then D."
ok
all the traffic goes into A, from there to B, ...
so, basically, you're still source-routing, but you expect many sources to choose the same route
and each of them reorders it
reordering in the classic mix sense, yes?
right. there are a limited number of routes people can pick
right. batching/permuting/etc.
ok, i think i get it
so the advantage is clear
so the deal there is that the nodes in a cascade watch each other, and if one of them says the cascade failed, it did.
aha
all nodes in a failed cascade lose a reputation point
that's cool
all nodes in a non-failed cascade (they last a day) gain a reputation point
now the *problem* here,
it makes it v. difficult for a node to lie about being reliable
(right)
is that the adversary's strategy is to make a bunch of nodes, get them put around into various cascades, and fail when it's advantageous.
for instance, if cascades are size 4,
right
i get it
then he will fail cascades where he has only one node, and not fail others.
* raph is fast at this kind of analysis :)
so with only a small percentage of nodes, he can move his nodes anywhere reputation-wise.
right
suck. :)
we kludge it in casc-rep with two things:
a) a web of trust to limit the number of bad nodes
that won't work
because a small number of bad nodes is a problem
b) we pick cascades such that even if the adversary put *all* of his nodes into the range from which you're picking the 4 nodes, it's still within acceptable safety bounds.
you only need |cascade|
well, if you have only |cascade|,
ok, b is more plausible
then you haven't broken any anonymity.
really?
that was a very fuzzy statement. it needs refining. read the paper. ;)
presumably, the goal of the attacker is to create a cascade of all-bad nodes
i will i promise
if you have an all-bad cascade, the attacker wins, yes?
yes.
ok
so the non-independence is interesting
(it can be complicated into an all-bad path rather than just all-bad cascade, if users can chain cascades. but we can ignore that for now.)
and you're assuming that cascade/path length never gets _very_ long right?
well, we posit paths of around 12 hops, in the paper.
i was gonna say, above 10, assuring reliability is a serious problem
well, isn't that based on current mix-nets?
yes, true
in theory, cascades that aren't reliable will die early
so if you use 3 of the functional cascades, and they stay functional, you're fine
if they don't, future people are more likely to be ok. etc.
ok, but we're talking about a relatively small constant, i think
right. 12 is still small enough that it's realistic for an attacker to get that # accepted
yes.
so the interesting thing, to me,
but the chance we'll build a path out of only his nodes is, well, a parameter we can work with.
is that you win by making the path more heterogenous
http://freehaven.net/doc/casc-rep/casc-rep.ps btw.
hm.
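The adversary strategy described above (sacrifice cascades where he has a lone node, let cascades with several of his nodes "succeed") is easy to see in a toy simulation. Everything here is illustrative: the node names, cascade assignments, and one-point scoring are a sketch of the casc-rep scheme as summarized in the chat, not the paper's actual mechanism:

```python
def run_day(cascades, bad, scores):
    """One reputation round: the adversary fails every cascade in which
    he has exactly one node, and lets the rest succeed. All members of
    a failed cascade lose a point; members of a successful one gain."""
    for cascade in cascades:
        n_bad = sum(1 for node in cascade if node in bad)
        failed = (n_bad == 1)           # the strategy from the chat
        for node in cascade:
            scores[node] += -1 if failed else 1

# size-4 cascades: the adversary sacrifices lone nodes b0 and b1 to drag
# honest nodes down, while b2 and b3 ride a "reliable" cascade upward
cascades = [["g0", "g1", "g2", "b0"],
            ["g3", "g4", "g5", "b1"],
            ["g6", "g7", "b2", "b3"]]
bad = {"b0", "b1", "b2", "b3"}
scores = {n: 0 for cascade in cascades for n in cascade}
run_day(cascades, bad, scores)
```

After one round, b2 and b3 outrank six of the nine honest nodes despite the adversary holding only a third of the network, which is the "move his nodes anywhere reputation-wise" problem.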
i should commit the newest version, shouldn't i.
ie, one node with an aclu affiliation, another with cia, another with swiss govt, etc
true. but we'll ignore that
yeah. that's hard to quantify.
so, i think you might want to kiss in terms of trust
* raph thinks
ok, this is just off the top of my head
but having the client evaluate a simple advo trust metric on the graph (which is published) seems like the right thing
so you use the reputations for reliability only
correct
and the trust metric for trust only
yes
aha, here's a thought
if you don't trust a node, you ignore all reputation ++'s and --'s that that node participated in
hmm, no, that's problematic
how so?
the probabilities go the wrong way as lengths scale
true
well, the *path* length goes up to 12, but the cascade length remains fixed at 4 or 5
ah
so it could still work
that's better
we left the cascade length low for that reason
obviously, i need to read the paper
so a single bad node would have limited influence on how many others it could take down with it.
you've convinced me it's worth reading :)
in the extreme case, it's simply free-route mixing with a buddy system (mixes come in pairs which watch each other)
mm-hmm
http://freehaven.net/doc/casc-rep/casc-rep.ps. the current one should be fine to read. we just finished the preproceedings copy, but it's not much different.
skip section 4 on your first reading, if you like.
fyi, gznork is completely non-anonymous
that was one of my first realizations
i couldn't solve double-spending without full traceability
even though it basically grew out of my thinking about how to create an anonymous system
it won't even be a good system for pirating _music_, much less terrorist planning
what'll it be for?
because if the riaa wants to be super-nasty and issue subpoenas, the (publicly readable) trust graph will tell them exactly who to go after
even if there is a provision for pseudonymous nyms
well, people will use it for that anyway, even though it's not a good idea :)
i'm kinda thinking one of the biggest legit uses is free software isos and the like
ah. so software redundancy and availability
that was cfs's stated goal too
and i think tangler's?
right
* raph is not familiar with tangler
really? o, you should look at it sometime
(in your copious free time)
is this one of your things
* raph recalls you talking about it
no, david mazieres and marc waldman, nyu
ah
.google tangler nyu
tangler nyu: http://www.cs.nyu.edu/~waldman
anyway, it's not as important as other things
but it's along the lines of free haven, but slightly more feasible.
aha
so, basically gznork is this:
an underlying stamp-trading network
the goal of that is to be able to discover trading partners
ok
and to enable trading with those, and shut out everyone else
the pattern of trading partners is chord-like
on top of that, i see implementing about 5 services
the first is vanilla bulk file transfer
and the reason why it's the first is that it's relatively easy
ie, you're not critically dependent on trust
or, to be more precise, you don't lose much.
yea
trust is still needed to keep the underlying network healthy
but it's not needed at all to provide the service itself
(we can assume that it maps some kind of hash to the file itself, avoiding file naming issues)
so it's relatively easy, and also something there's a high demand for
o right, you've got this nasty issue of actually deploying it and making people want to run it. that's a hard enough problem in its own right.
:)
well, that's not my main concern, actually :)
but yeah
so the other services are harder
#2 is probably email
and there we have a number of interesting goals
primarily spam-resistance
right
but i'll throw a couple of other things in for good measure, like transparent pk encryption and reliability
* arma chants 'pki'
that's #3
a name server, pretty much exactly the same as my thesis design
but an important point is that #2 will run without a name service
* raph chants 'ssh'
uh.. huh?
ssh works without a pki
what's a pki?
what's a name service?
ok, i have no idea what a pki is
the name service is much easier
or rather, how the (*# can you do email without naming?
it's a namespace global to gznork, first-come, first-serve policy
ok.
hashes of public keys :)
hm.
which you need to instantiate the trust graph
ok. so like pgp then.
there will be several keys which claim to be arma@mit.edu,
so a big part of gznork is making those hashes pretty and accessible
and you'll pick the one whose trust characteristics lead you to believe it's me.
actually, no
good. then how?
ok
in the non-pki case, i'll just pick 4ab3 776a 1253
based on you having given me a beautifully printed card with that information, last time we had a beer together
business card pki. ok.
or, perhaps, the other way around, and me responding to an email you sent me
with the name service, it's a bit better
gznork enforces the 1st-come, 1st-serve policy
so you tell me that your gznork name is "arma", and i'll believe you
if somebody else took "arma" first, then you'll tell me it's "arma2" or whatever
then, i just type in "arma", and it resolves the key
right. and lying gets me nowhere, and, when you send me mail, it says "arma"
because either i have the private key or not.
yeah, the issue is not lying, it's being fooled
but my feeling is that people can _understand_ this
as opposed to just about everything else that's said in pki-land
ah. the usability ogre.
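The first-come, first-served naming scheme sketched above boils down to a registry mapping a human-readable name to the hash of a public key, with no second registration allowed. A toy sketch; the class name and string "keys" are illustrative, not a real key format:

```python
import hashlib

class NameService:
    """Toy gznork-style namespace: first-come, first-served binding of
    a name to the hash of a public key."""
    def __init__(self):
        self.names = {}

    def register(self, name, pubkey):
        """Bind name to the key's hash; refused if the name is taken."""
        if name in self.names:
            return False                # first registrant keeps the name
        self.names[name] = hashlib.sha256(pubkey).hexdigest()
        return True

    def resolve(self, name):
        """Look up a name; returns the key hash, or None if unbound."""
        return self.names.get(name)

ns = NameService()
ok1 = ns.register("arma", b"roger-public-key")      # succeeds: name free
ok2 = ns.register("arma", b"impostor-public-key")   # refused: too late
```

The "arma2" behavior in the chat falls out naturally: the latecomer just registers under a different name, and mail signed with the matching private key proves ownership of whichever name resolved.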
yes, gznork is as much about usability as science
they'll understand that when mail arrives from arma2, with roger's name at the bottom, it'll be clear that it might not be from roger? hrm.
so, unfortunately, the one area where this falls down is trademarks
arma: some people will be able to understand that
the remainder will never, ever, ever, have the opportunity to experience secure email
others will get burned.
i guess we can't win all the battles at once. right.
i imagine that the first thing that will happen when gznork catches on is for somebody to register keys for all the trademarks
cause hey, registration is free, just like the old internic
this may cause a problem if you get an email from "wells fargo bank"
actually, it would be better if registration cost $10 or so, payable to me, but i can't figure out how to do that and make the network decentralized, attack-resistant, and so on
you could make registration free but you can send a $10 and trump an address. if you don't want your address trumped, you better pay first.
right
evil. but anyway. :)
hey, you could even have multi-tier trumping
so, yeah, probably the right thing to do is have hierarchical name spaces, and brands
so if you really don't want it trumped, you pay $1000. etc.
"how much is it worth to you to become wells fargo bank?"
interesting concept, but i find it a bit distasteful
probably just too reminiscent of the current dns policy :)
heh
so, i create a new brand, say "faligent"
because it has to not be an existing trademark :)
and i promote the brand as being tm-friendly
so "faligent.coke" is coke, not a cocaine dealer
"faligent.ford" is not a hitchhiker's fansite
etc
and you need to know faligent's key to generate a faligent.* name?
faligent sells them
and other people can't forge them, i mean
ah, how do you enforce the policy?
there are a few ways to do that the most dnssec-like is to have clients do the verification so you look up "faligent" in your global namespace, then verify the chain from there ie, faligent signs the coke cert, etc so faligent.coke is effectively 'signed' by faligent ok but there are other interesting ways of doing it this will be chapter 5 of ye olde thessys there's a crypto way on the tip of my mind ? basically hashes and stuff can't generate a faligent.foo without faligent's key as part of the process hmm erm. would have to get some sleep for it first. but i'm confident it can be done. (and has been done) intriguing that's faligent's private key, yes? right. i'd probably start looking at sfs first, to try to remind myself. anyway, i'm not that concerned about that are you familiar with sfs? only vaguely the more challenging problem is the root namespace ok. if you're not concerned, i'll just dump that topic and move on. :) i think if you can solve that, you can get hierarchical delegation for nearly-free ok, so services #4 and #5 #4 is firefly but not sucky like firefly this was going to be chapter 6 of my thesis but ch6 is turning into an analysis of google and #5 is perhaps the most fun it's a mud more specifically, you have objects for "things", "rooms", and "avatars" and you implement semantics for them, like a thing can be inside a room have you done any inform/zcode hacking? no, but i wrote a mud a while back righto it's how i learned c. that must have been... 10 years ago? so doing it distributed is really challenging right in fact, basically what you want is transactions yep ie, if you move a thing from room a to room b, then you want that to be atomic so doing it as a mud is a way to keep it fun hey, that reminds me. hm. i'm sure that financial transactions would be equally valid but that'll be for somebody else btw, have you looked at miguel castro's work under liskov on byzantine transactions?
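Backing up to the faligent.coke discussion: the dnssec-like option, where clients walk the chain themselves, can be sketched as below. The sign/verify pair here is a deliberate toy (a hash over a shared secret), standing in for a real public-key signature scheme such as ed25519; the cert layout is likewise an assumption made just for this sketch:

```python
import hashlib

def sign(priv: bytes, msg: bytes) -> bytes:
    # Toy stand-in for a real signature: verification below reuses
    # the same secret, which real public-key signatures do NOT do.
    return hashlib.sha256(priv + msg).digest()

def verify(key: bytes, msg: bytes, sig: bytes) -> bool:
    return sign(key, msg) == sig

def verify_chain(trusted_key: bytes, name: str, certs: dict):
    """Client-side verification of a name like "faligent.coke":
    the global namespace hands us faligent's key (trusted_key),
    and each deeper label must carry a cert signed by the key one
    level up. Returns the leaf key, or None if any link fails."""
    labels = name.split(".")
    key = trusted_key
    for i in range(1, len(labels)):
        full = ".".join(labels[: i + 1])
        if full not in certs:
            return None
        child_key, sig = certs[full]
        if not verify(key, full.encode() + child_key, sig):
            return None
        key = child_key
    return key
```

The point of the structure is exactly the one made in the chat: nobody can produce a verifiable faligent.foo without faligent's key participating in the signing.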
yes ah, good i _think_ that can be generalized to the trust-metric world that would be neat. it's hard stuff, from my perspective the problem is that not all nodes have the same view of what the other nodes are i haven't looked at it enough to be able to extend it castro's a smart guy. but there's an intuition why i believe it should be possible consider the set of all nodes trusted by the good set (ie that includes some bad ones) the good set ? ok nodes are good or bad each node trusts a set of other nodes ok. sounds good. in the classic byz model, the trust set is the complete set and is the same for all nodes and you win as long as 2/3 of the nodes in the set are good now we have defined good, bad, confused so my intuition is to consider any node which is trusted by only some of the good nodes as bad and if you still have a 2/3 majority, i think you still win but i'm not sure as you say, byzantine transaction foo is hard (it's interesting how many ppl i know who got into hardcore programming through implementing a mud) (at least one programming languages guy at berkeley, and several others i know through free software) hey, bbs, play mud, run mud, move on it's how we all did it, right? not me but i'm weird ok, i really need to get to bed now http://mit.edu/arma/Public/6.313f.tex is a paper i wrote for tom knight ~4 years ago, on implementing a distributed mud anyway. :) yes, sleep. and read casc-rep and tell me if you buy it. yes (wow. i hadn't thought of that as being p2p. but it needs to be.) i probably won't buy casc-rep, that is (the mud thing, i mean) ok. good. :) but this is high praise there is no other work i've seen on anonymity that rates that high oh cool, you really have thought about this stuff hm? no wonder you're at least familiar with castro i used castro's byzantine toolkit in the submission to info hiding 4 then we found a better way to solve our problem (that submission is [7] in casc-rep) aha (er, turned into [7], that is.
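The 2/3 intuition in the exchange above can be made concrete in a few lines: keep only the nodes that *every* good node trusts, count everything outside that common set as bad, and ask whether good nodes still clear the classic byzantine threshold inside it. This is only the chat's rough intuition written down, not a proof or a real protocol; the node names in the test are made up:

```python
def quorum_ok(good_nodes, trust):
    """good_nodes: the set santa claus knows to be good.
    trust: node -> set of nodes it trusts.
    Demote any node trusted by only some of the good nodes to
    bad, then check for a strict 2/3 good majority in what's
    left (3 * good > 2 * total avoids float arithmetic)."""
    common = set.intersection(*(trust[n] for n in good_nodes))
    good_in_common = len(common & set(good_nodes))
    return 3 * good_in_common > 2 * len(common)
```

In the classic byzantine model every node's trust set is the complete node set, so `common` is just all nodes and this reduces to the usual more-than-2/3-good condition.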
it changed a *lot* between acceptance and preproceedings.) ok, will probably read casc-rep tomorrow, unless i get completely caught up in writing those were the days. we wrote that paper in 10 days, from deciding to write a paper, to finding a topic, to solving it, to submitting the paper. what did you mean 'rates this high'? in your eyes, or is there some global chart i should know about? in my eyes :) (damn these decentralized approaches ;) there's quite a bit of work on anonymity that's utter crap but i don't have to tell you this s/work/"work"/ :) yeah. we're on the right track. we're still partially crap, but i'm coming to understand this, so hopefully i can do better next time. yup this is why i do graphics the problems are soooo much easier ...and then realize that *that* is crap, and do better the next, ... and it's much easier to convince people that you've solved them :) ok now bed hope to talk to you again soon! good night. *** raph has quit IRC ("sleep") yep. you too raph always distracts me so well. i've gotten nothing done the past hour. foo.