/* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % % FUZZ TUTORIAL 0.1 % % % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /* The tutorial is composed by several examples showing how to program in Fuzz. The examples show in an incremental way how to use the different program constructions to ensure differentially private queries. The examples are followed by two libraries, one for functions working on list and one for functions working on bags. Finally, at the end you can find a summary of the Fuzz syntax. All the examples composing the tutorial are also present in the current Fuzz distribution. For more information on Fuzz, its implementation and its type system please refer to the ICFP10 and USENIX11 papers. */ /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Fuzz tutorial - 1 - % %Query asking for the number of citizens over 40 present in % %the census database % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ //Include a library for bag functions containing the function bagfilter used below #include "library-bags.fz" //census format line: age | sex | race | union | zipcode | income | ... | ... typedef censusline = (num,(string,(string,(string,(num,(num,(num,num))))))); //over_40 : censusline -> bool //The test function has an infinite sensitivity function over_40 (line:censusline) : bool { let (age,aage) = line; 40 < age } //over_40_query : censusline bag -o fuzzy num //The function query has sensitivity 1epsilon as witnessed by its type //The argument 50.0 of the bagfilter function is the number ot steps //we allow the function run. //The primitive 'fuzz' adds Laplace noise L_{1/epsilon} before returning the result. function over_40_query (db:[] censusline bag) : fuzzy num { fuzz (bagsize (bagfilter over_40 50.0 db)) } //main : censusline bag -o fuzzy string //The function main is just used to print the message, note however that it should //preserve the sensitivity. This is done by using the 'sample' let construction // and the monadic 'return' to deal with the probabilistic monad function main (db:[] censusline bag) : fuzzy string { sample numOver40 = over_40_query db; msg = " The number of citizens above 40s is : " + (num2string(numOver40)) + "\n"; return msg } //The function main is called main /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Fuzz tutorial - 2 - % %Query asking for both the average income of male and female % %in the census database. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /* Fuzz tutorial - 2 - Query asking for both the average income of male and female in the census database. */ //census format line: age | sex | race | union | zipcode | income typedef censusline = (num,(string,(string,(string,(num,(num,(num,num))))))); //In order to respect function sensitivity a limit on the possible income is needed. //As a concrete instance we limit it at $100,000. #define LIMIT 100000.0 //is_male : censusline -> clipped //As expressed by its type the function is_male returns a //result clipped in the interval [0,1] function is_male (line : censusline) : clipped { let (age, aage) = line; let (sex, asex) = aage; clip (if (sex == "M") then { 1.0 } else { 0.0 }) } //is_female : censusline -> clipped function is_female (line : censusline) : clipped { let (age, aage) = line; let (sex, asex) = aage; clip (if (sex == "F") then { 1.0 } else { 0.0 }) } //male_income : censusline -> clipped //Again, the function male_income returns a result clipped //in the interval [0,1]. This time, in order to obtain this //we use the limit we have imposed. function male_income (line : censusline) : clipped { let (age, aage) = line; let (sex, asex) = aage; let (race, arace) = asex; let (union, aunion) = arace; let (zipcode, azip) = aunion; let (income, aincome) = azip; clip (if (sex == "M") then { income / LIMIT } else { 0.0 }) } //female_income : censusline -> clipped function female_income (line : censusline) : clipped { let (age, aage) = line; let (sex, asex) = aage; let (race, arace) = asex; let (union, aunion) = arace; let (zipcode, azip) = aunion; let (income, aincome) = azip; clip (if (sex == "F") then { income / LIMIT } else { 0.0 }) } //main : !_4 censusline bag -o fuzzy string //The function main is used to test the other functions. //The different functions are composed in a sequential way, //so the resulting function has the sensitivity equal to the sum of //the sensitivity of its components, i.e. in this specific instance 4. //The instaces of the primitive 'fuzz' add Laplace noise, the instances //of the 'sample' let construction deal with the probabilistic monad in order //to print the result. function main (db:[4] censusline bag) : fuzzy string { sample numMale = fuzz (bagsum (bagmap is_male (clip 0.0) 45.0 db)); sample numFemale = fuzz (bagsum (bagmap is_female (clip 0.0) 42.0 db)); sample totMaleIncome = fuzz (bagsum (bagmap male_income (clip 0.0) 43.0 db)); sample totFemaleIncome = fuzz (bagsum (bagmap female_income (clip 0.0) 44.0 db)); msg = " Avg Male Income: " + (num2string(LIMIT * totMaleIncome / numMale)) + "\n" + " Avg Female Income: " + (num2string(LIMIT * totFemaleIncome / numFemale)) + "\n"; return msg } main /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Fuzz tutorial - 3 - % %Query computing a histogram of the ages of citizens in the % %census database. % %The intervals for the histogram are of 10 years, starting % %from 0 to 100. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ //Include a library for list functions #include "library-lists.fz" //census format line: age | sex | race | union | zipcode | income | ... | ... typedef censusline = (num,(string,(string,(string,(num,(num,(num,num))))))); //hist_aux : num -> num bag -o list(num bag) function hist_aux (c: num) (s:[] num bag) : list(num bag) { if (c<101) then { let (y,n) = (bagsplit (fun (z : num) : bool {z num function find_age (line : censusline) : num { let (age, aage) = line; age } //hist_query_aux : censusline bag -o list(fuzzy num) function hist_query_aux (db :[] censusline bag) : list(fuzzy num) { map fuzz (hist (bagmap find_age 0 50 db)) } //seq : list(fuzzy num) -o fuzzy list(num) //The function seq is the monadic sequencing. //Note the use of the 'sample' let binder and of the monadic 'return' to change the //type of the list. function seq (il :[] list(fuzzy num)) : fuzzy list(num) { case unfold il { inl(y) => return fold[list(num)](inl[inner(num)](())) |inr(y) => let (h,tl) = y; sample z=h; sample u= (seq tl) ; return fold[list(num)](inr[inner(num)](z,u)) } } //hist_query : censusline bag -o fuzzy list(num) function hist_query (db :[] censusline bag) : fuzzy list(num) { seq (hist_query_aux db) } //printer: list(num) -> string function printer (il : list(num) ) : string { case unfold il { inl(y) => " " |inr(y) => let (h,tl) = y; " "+ (num2string h) + " " + (printer tl) } } //main: censusline bag -> fuzzy string function main (db :[] censusline bag) : fuzzy string { sample histogram = hist_query db; msg = " Histogram: " + (printer histogram) + "\n"; return msg } main /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Fuzz tutorial - 4 - % %Query computing a histogram of the number of web requests % %that came from specific subnets present in an Apache web % %server log database. % %The query is tested on the ip address having as a % %prefix 66.249.71 % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ //Include a library for list functions #include "library-lists.fz" //ip format as standard typedef ip = (num,(num,(num,(num,())))); //logline containing an ip and an identifier typedef logline = (ip,num); //webserverlog_list : list(logline->bool) -> logline bag -o fuzzy list(num) //Function computing the actual histogram. //Note the repeteaded use of the 'sample' let binder and of the monadic 'return' //to deal with the fuzzy elements function webserverlog_list (preds : list(logline->bool)) (arg :[] logline bag) : fuzzy list(num) { let (predcounts, rest) = (((pubfoldl (fun (ac :[] (fuzzy list(num), logline bag)) { fun (curpred : logline->bool) : (fuzzy list(num),logline bag) { let (thelist,thebag) = ac; let (is, isnot) = ((bagsplit curpred 105.0) thebag); sample issize = (fuzz (bagsize is)); sample sampledlist = thelist; (return cons(num,issize,sampledlist),isnot) }} )) (return nil(num), arg)) preds); sample catchnum = (fuzz (bagsize rest)); sample sampledpredcounts = predcounts; return cons(num,catchnum,sampledpredcounts) } //q : logline -> bool //function testing if a logline starts with 66.249.71 function q (line : logline) : bool { let (ipaddr, code) = line; let (ip1,ip234) = ipaddr; let (ip2,ip34) = ip234; let (ip3,ip4) = ip34; ip1 == 66 && ip2 == 249 && ip3 == 71 //ip locations looking for } //preds : list(logline -> bool) //list of tests, in our case it contains only the function q above preds = cons(logline -> bool, q ,nil(logline -> bool)); //main : logline bag -o fuzzy string //The function main is used to print the message in a sensitivity preserving way function main (arg:[] logline bag) : fuzzy string { sample result = webserverlog_list preds arg; msg = " Records Matching IP Query: " + (num2string(head(tail(result)))) + "\n" + "Records Not Matching IP Query: " + (num2string(head(result))) + "\n"; return msg } main /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Fuzz tutorial - 5 - % %Kmeans- Function clustering a set of points and returning % %the three cluster centers % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ #include "library-lists.fz" //Type definitions typedef point = (num, num); typedef pointlist = mu X => (() + (point,X)); typedef labpointbag = (point,num) bag; // labelled point bag typedef pointbag = (num, num) bag; //totx : pointbag -o fuzzy num function totx (db :[] pointbag) : fuzzy num { fuzz (bagsum ((bagmap (fun (p:point) : clipped {let(x,y)=p; clip x})) (clip 0.5) 28.0 db)) } //toty : pointbag -o fuzzy num function toty (db :[] pointbag) : fuzzy num { fuzz (bagsum ((bagmap (fun (p:point) : clipped {let(x,y)=p; clip y})) (clip 0.5) 27.0 db)) } //tot : pointbag -o fuzzy num function tot (db :[] pointbag) : fuzzy num { fuzz (bagsum ((bagmap (fun (p:point) : clipped { clip 1 })) (clip 0.5) 33.0 db)) } //zip : list(A) -> list(B) -> list((A,B)) //zip [1,2,3] ["a","b","c"] = [(1,"a"),(2,"b"),(3,"c")] //zip [1,2] [9,8,7,6,5,4,3] = [(1,9),(2,8)] //zip [9,8,7,6,5,4,3] [1,2] = [(9,1),(8,2)] function zip (l1 : list(A)) (l2 : list(B)) : list((A,B)) { case unfold l1 { inl(x) => NIL((A,B)) | inr(x) => case unfold l2 { inl(y) => NIL((A,B)) | inr(y) => let (xh,xt) = x; let (yh,yt) = y; CONS((A,B),(xh,yh),((zip xt) yt)) } } } //avg : (point,num) -o (num,num) function avg (pi : (point, num)) : (num,num) { let (p, i) = pi; let (x, y) = p; (x / i, y / i) } // sqdist : point -> point -> num // Returns the squared distance between two points function sqdist (p1 : point) (p2 : point) : num { let (x1,y1) = p1; let (x2,y2) = p2; dx = x2 - x1; dy = y2 - y1; dx * dx + dy * dy } // argminaux : num -> num -> num -> list(num) -> num // auxiliary function for argmin. best is the least value seen so far, // bestIndex is its index in the list. curIndex is the current index, // x is the list of elements left to consider. function argminaux (best : num) (bestIndex : num) (curIndex:num) (x : list(num)) : num { case unfold x { inl(y) => bestIndex | inr(y) => let (h,tl) = y; if (h < best) then { argminaux h curIndex (curIndex+1) tl } else { argminaux best bestIndex (curIndex+1) tl } } } // argmin : list -> int // Returns the index of the smallest element in the list function argmin (x : list(num)) : num { argminaux 1000000 0 0 x } //assign : pointlist -> pointbag -o ((point,num) bag) function assign (means : pointlist) (db :[] pointbag) : ((point,num) bag) { (bagmap (fun (pt : point) : (point, num) { sqdists = pubmap (fun (pt2 : point) : num { sqdist pt pt2 }) means; (pt, argmin sqdists) })) ((0.0, 0.0), 0.0) 45.0 db } //partition : labpointbag -o num -> num -> list((point, num) bag) function partition (ldb :[] labpointbag) (m:num) (n:num) : list((point, num) bag) { if (m == n) then { NIL((point,num) bag) } else { let (yes, no) = ((bagsplit (fun (x:(point,num)) : bool { let (pt,i)=x; i == m })) 26.0 ldb); CONS((point,num) bag, yes, partition no (m+1) n) } } //iterate : !_3 pointbag -o pointlist -> fuzzy pointlist //Note that the sensitivity of iterate is 3epsilon function iterate (db :[3] pointbag) (means : pointlist) : fuzzy pointlist { ldb = ((assign means) db); p = (((partition ldb) 0) (length means)); db2 = ((map (bagmap (fun (x:(point,num)) : point { let (pt,n) = x; pt} ) (0.0, 0.0) 29.0 )) p); sample tx = listfuzz (map totx db2); sample ty = listfuzz (map toty db2); sample t = listfuzz (map tot db2); info = zip (zip tx ty) t; return (pubmap avg info) } //initMeans : () -> pointlist function initMeans (x : ()) : pointlist { point1 = (0.1, 0.1); point2 = (0.2, 0.3); point3 = (0.5, 0.9); CONS(point, point1, CONS(point, point2, CONS(point, point3, NIL(point)))) } //report : pointlist -> string function report (il : pointlist) : string { case unfold il { inl(u) => "" |inr(p) => let (elem,rest) = p; let (x, y) = elem; "(" + (num2string x) + ", " + (num2string y) + ")\n" + (report rest) } } // !_6 pointbag -o fuzzy string //Note that this function since use iterate twice becomes 6epsilon sensitive fun (db :[6] pointbag) : fuzzy string { x = (initMeans ()); sample foo = iterate db x; sample foo2 = iterate db foo; return (report (foo2)) } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % % LIBRARIES % % % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Libraries for functions working on lists % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /* List type definitions */ #define list(A) (mu XX => (() + (A,XX))) #define flist(A) (mu XX => (() + (fuzzy A,XX))) #define inner(A) (() + (A, list(A))) #define finner(A) (() + (fuzzy A, flist(A))) #define NIL(A) fold[list(A)] (inl[inner(A)] ()) #define CONS(A,a,b) fold[list(A)] (inr[inner(A)] (a,b)) #define FNIL(A) fold[flist(A)] (inl[finner(A)] ()) #define FCONS(A,a,b) fold[flist(A)] (inr[finner(A)] (a,b)) #define cons(X,a,b) (fold[list(X)] (inr[inner(X)] (a,b))) #define nil(A) fold[list(A)] (inl[inner(A)] ()) //pubmap : (X -> Y) -> list(X) -> list(Y) function pubmap (f : X -> Y) (l : list(X)) : list(Y) { fold[list(Y)] case unfold l { inl(x) => inl[inner(Y)](()) | inr(x) => inr[inner(Y)](let (h,t) = x; ((f h),(pubmap f) t)) } } //pubfold : (B -o (A -> B)) -> B -o list(A) -> B function pubfoldl (f : (B -o (A -> B))) (ac :[] B) (il : list(A)) : B { case unfold il { inl(y) => ac |inr(y) => let (elem,rest) = y; ((pubfoldl f) ((f ac) elem)) rest } } //head : list(num) -> num /* returns 0 for empty list of nums */ function head (il : list(num)) : num { case unfold il { inl(y) => 0 |inr(y) => let (elem,rest) = y; elem } } //tail : list(A) -> list(A) /* returns nil for empty list */ function tail (il : list(A)) : list(A) { case unfold il { inl(y) => nil(A) |inr(y) => let (elem,rest) = y; rest } } //map : (X -o Y) -> list(X) -o list(Y) function map (f : X -o Y) (l :[] list(X)) : list(Y) { fold[list(Y)] case unfold l { inl(x) => inl[inner(Y)](()) | inr(x) => inr[inner(Y)](let (h,t) = x; (f h,map f t)) } } //listfuzz : flist(nume) -o fuzzy list(num) function listfuzz (fl :[] flist(num)) : fuzzy list(num) { case unfold fl { inl(x) => return NIL(num) | inr(x) => let (lhead, ltail) = x; sample y = lhead; sample tailsample = listfuzz ltail; return CONS(num, y, tailsample) } } //length : list(A) -> num function length (x: list(A)) : num { ((pubfoldl (fun (y:[]num) { fun (z:A) : num { y+1 } })) 0) x } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Libraries for functions working on bags % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ /*Several functions dealing with bags are built-in in Fuzz. Here we recall their type signatures and their use. For a precise definition see the ICFP10 paper. bagsize : A bag -o num Given a bag it returns the number of elements in it bagsplit : (A -> bool) -> num -> A bag -o ( A bag x A bag ) Splits a bag in two bags: one containing the elements satisfying the test (the first argument), the other the elements that don't pass the test. The second argument (of type 'num') is the number ot steps we allow the function run. bagsum : (num bag) -o num Returns the sum of the elements (clipped to the interval [-1,1]) of the input bag. bagmap : (A -> B) -> num -> A bag -o B bag This is the straightforward adaptation of the classical map function on lists to the case of bags. The second argument (of type 'num') is the number ot steps we allow the function run. In the following we add some other useful functions. */ //bagfilter : (A -> Bool) -> num -> A bag -o A bag§ //Returns a bag containing the elements of the input bag passing the //test (the first argument). function bagfilter (test: A -> Bool) (count: num) (b:[] A bag) : A bag { let (yes,no) = bagsplit test count b;yes } /* %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % % FUZZ LANGUAGE SYNTAX % % % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% */ R ::= floats REALS r ::= SENSITIVITY ANNOTATIONS R+ concrete sensitivity (strictly positive) inf infinite sensitivity T ::= TYPES () single-valued unit type num "real" numbers int integers bool booleans clipped numbers whose absolute value is at most 1 T set set T bag multiset (T,T) tensor (|T,T|) with T + T sum fuzzy T probability distributions over T T -o[r] T r-sensitive function T -o T 1-sensitive function T -> T ordinary function X type variable mu X => T recursive type e ::= EXPRESSIONS () value of void type R numbers c constant e op e binary functions (==,<=,!) (e,e) tensor-pairs let (x,y) = e; e tensor destructor (|e,e|) with-pairs fst e with-pair destructor snd e with-pair destructor inl[T] e injection into with-pair inr[T] e injection into with-pair case e {inl(x) => e | inr(x) => e} case analysis fuzz t adding noise sample x=e; e sampled let binding return e monadic return fun (x:T) {e} ordinary function fun (x:[r]T) {e} r-sensitive function fun (x:[]T) {e} 1-sensitive function function x (y:T) (z:T) ... : t {e} e function definition with multiple arguments e e application fold[T] e fold up a recursive type unfold e unfold a recursive type x = e; e let-binding x : t = e; e let-binding with type assertion typedef x = t; e type definition if e then {e} else {e} conditional