Commit 71b1829

#26 Update documentation
1 parent 4aac280 commit 71b1829

File tree

1 file changed (+44, -6 lines)

README.md

Lines changed: 44 additions & 6 deletions
# RL anonymity

An experimental effort to use reinforcement learning techniques for data anonymization.

## Conceptual overview

The term data anonymization refers to techniques that can be applied to a given dataset, D, such that, after
the dataset has been submitted to such techniques, it is difficult for a third party to identify or infer the existence
of specific individuals in D. Anonymization techniques typically result in some sort of distortion
of the original dataset. This means that, in order to maintain some utility of the transformed dataset, the transformations
applied should be constrained in some sense. In the end, it can be argued that data anonymization is an optimization problem,
namely striking the right balance between data utility and privacy.

Reinforcement learning is a learning framework based on accumulated experience. In this paradigm, an agent learns by interacting with an environment,
largely without supervision. The following image describes, schematically, the reinforcement learning framework.

![RL paradigm](images/agent_environment_interface.png "Reinforcement learning paradigm")

The agent chooses an action, ```a_t```, to perform out of a predefined set of actions, ```A```. The chosen action is executed by the environment
instance, which returns to the agent a reward signal, ```r_t```, as well as the new state, ```s_t```, that the environment is in.
The framework has been used successfully in many recent advances in control, robotics, games and elsewhere.
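This interaction loop can be sketched in Python. The ```Environment``` class and its reward rule below are illustrative placeholders, not part of this repository:

```python
import random

class Environment:
    """Toy stand-in environment: the state is an integer counter."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        # Execute the chosen action; return (reward r_t, new state s_t).
        self.state += action
        reward = 1.0 if self.state % 2 == 0 else -1.0
        return reward, self.state

A = [0, 1]                      # predefined set of actions
env = Environment()
trajectory = []
for t in range(5):
    a_t = random.choice(A)      # the agent chooses an action a_t from A
    r_t, s_t = env.step(a_t)    # the environment returns r_t and the new state s_t
    trajectory.append((a_t, r_t, s_t))

print(len(trajectory))          # five agent-environment interaction steps
```

A learning agent would replace the random choice with a policy that improves from the accumulated ```(a_t, r_t, s_t)``` experience.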

Let's assume that we have at our disposal two numbers: a minimum distortion, ```MIN_DIST```, that should be applied to the dataset
in order to achieve privacy, and a maximum distortion, ```MAX_DIST```, that should not be exceeded in order to maintain some utility of the dataset.
Let's also assume that any overall dataset distortion in ```[MIN_DIST, MAX_DIST]``` is acceptable for casting the dataset as
both privacy-preserving and utility-preserving. We can then train a reinforcement learning agent to distort the dataset
such that this objective is achieved.
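A minimal sketch of a reward signal encoding this objective (the threshold values and the ```reward``` function below are illustrative assumptions, not the repository's actual implementation):

```python
MIN_DIST = 0.2   # assumed minimum distortion needed for privacy
MAX_DIST = 0.8   # assumed maximum distortion tolerable for utility

def reward(total_distortion):
    """Positive reward while the overall dataset distortion stays inside
    [MIN_DIST, MAX_DIST]; negative reward otherwise."""
    if MIN_DIST <= total_distortion <= MAX_DIST:
        return 1.0
    return -1.0

print(reward(0.5))   # inside the acceptable band -> 1.0
print(reward(0.1))   # too little distortion, privacy not achieved -> -1.0
```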

Overall, this is shown in the image below.

![RL anonymity paradigm](images/general_concept.png "Reinforcement learning anonymity schematics")

The images below show the overall running average distortion and running average reward achieved by using the
<a href="https://en.wikipedia.org/wiki/Q-learning">Q-learning</a> algorithm and various policies.

**Q-learning with epsilon-greedy policy and constant epsilon**

![RL anonymity paradigm](images/q_learn_epsilon_greedy_avg_run_distortion.png "Epsilon-greedy constant epsilon")
![RL anonymity paradigm](images/q_learn_epsilon_greedy_avg_run_reward.png "Reinforcement learning anonymity schematics")

**Q-learning with epsilon-greedy policy and decaying epsilon per episode**

![RL anonymity paradigm](images/q_learn_epsilon_greedy_decay_avg_run_distortion.png "Reinforcement learning anonymity schematics")
![RL anonymity paradigm](images/q_learn_epsilon_greedy_decay_avg_run_reward.png "Reinforcement learning anonymity schematics")

**Q-learning with epsilon-greedy policy and epsilon decaying at a constant rate**

![RL anonymity paradigm](images/q_learn_epsilon_greedy_decay_rate_avg_run_distortion.png "Reinforcement learning anonymity schematics")
![RL anonymity paradigm](images/q_learn_epsilon_greedy_decay_rate_avg_run_reward.png "Reinforcement learning anonymity schematics")

**Q-learning with softmax policy**

![RL anonymity paradigm](images/q_learn_softmax_avg_run_distortion.png "Reinforcement learning anonymity schematics")
![RL anonymity paradigm](images/q_learn_softmax_avg_run_reward.png "Reinforcement learning anonymity schematics")

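As a rough sketch of the algorithm behind these plots, a tabular Q-learning update with an epsilon-greedy policy can look like the following. The state and action counts and the hyperparameters are illustrative placeholders, not the repository's actual configuration:

```python
import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]   # tabular Q-function
alpha, gamma, epsilon = 0.1, 0.9, 0.2              # learning rate, discount, exploration

def epsilon_greedy(state):
    # With probability epsilon explore uniformly; otherwise act greedily w.r.t. Q.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[state][a])

def q_update(s, a, r, s_next):
    # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])

# One illustrative transition: taking action 1 in state 0 yields reward 1.0
# and lands in state 1.
q_update(0, 1, 1.0, 1)
print(Q[0][1])   # 0.1 after a single update of a zero-initialized table
```

The decaying-epsilon variants shown above would shrink ```epsilon``` over episodes, and a softmax policy would sample actions with probabilities proportional to ```exp(Q[s][a] / tau)``` instead of the greedy/uniform split.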
## Dependencies

- NumPy

## Documentation
## References