Markopy
Utilizing Markov Models for brute forcing attacks
Markov::API::ModelMatrix Class Reference

Class to flatten and reduce Markov::Model to a Matrix. More...

#include <modelMatrix.h>

Public Member Functions

 ModelMatrix ()
 
bool ConstructMatrix ()
 Construct the related Matrix data for the model. More...
 
void DumpJSON ()
 Debug function to dump the model to a JSON file. More...
 
int FastRandomWalk (unsigned long int n, const char *wordlistFileName, int minLen=6, int maxLen=12, int threads=20, bool bFileIO=true)
 Random walk on the Matrix-reduced Markov::Model. More...
 
void Import (const char *filename)
 Open a file to import with filename, and call bool Model::Import with std::ifstream. More...
 
void Train (const char *datasetFileName, char delimiter, int threads)
 Train the model with the dataset file. More...
 
std::ifstream * OpenDatasetFile (const char *filename)
 Open dataset file and return the ifstream pointer. More...
 
std::ofstream * Save (const char *filename)
 Export model to file. More...
 
void Generate (unsigned long int n, const char *wordlistFileName, int minLen=6, int maxLen=12, int threads=20)
 Call Markov::Model::RandomWalk n times, and collect output. More...
 
void Buff (const char *str, double multiplier, bool bDontAdjustSelfLoops=true, bool bDontAdjustExtendedLoops=false)
 Buff expression of some characters in the model. More...
 
char * RandomWalk (Markov::Random::RandomEngine *randomEngine, int minSetting, int maxSetting, char *buffer)
 Do a random walk on this model. More...
 
void AdjustEdge (const char *payload, long int occurrence)
 Adjust the model with a single string. More...
 
bool Import (std::ifstream *)
 Import a file to construct the model. More...
 
bool Export (std::ofstream *)
 Export a file of the model. More...
 
bool Export (const char *filename)
 Open a file to export with filename, and call bool Model::Export with std::ofstream. More...
 
Node< char > * StarterNode ()
 Return starter Node. More...
 
std::vector< Edge< char > * > * Edges ()
 Return a vector of all the edges in the model. More...
 
std::map< char, Node< char > * > * Nodes ()
 Return a map of all the nodes in the model. More...
 
void OptimizeEdgeOrder ()
 Sort edges of all nodes in the model ordered by edge weights. More...
 

Protected Member Functions

int FastRandomWalk (unsigned long int n, std::ofstream *wordlist, int minLen=6, int maxLen=12, int threads=20, bool bFileIO=true)
 Random walk on the Matrix-reduced Markov::Model. More...
 
void FastRandomWalkPartition (std::mutex *mlock, std::ofstream *wordlist, unsigned long int n, int minLen, int maxLen, bool bFileIO, int threads)
 A single partition of FastRandomWalk event. More...
 
void FastRandomWalkThread (std::mutex *mlock, std::ofstream *wordlist, unsigned long int n, int minLen, int maxLen, int id, bool bFileIO)
 A single thread of a single partition of FastRandomWalk. More...
 
bool DeallocateMatrix ()
 Deallocate matrix and make it ready for re-construction. More...
 

Protected Attributes

char ** edgeMatrix
 2-D Character array for the edge Matrix (The characters of Nodes) More...
 
long int ** valueMatrix
 2-d Integer array for the value Matrix (For the weights of Edges) More...
 
int matrixSize
 to hold Matrix size More...
 
char * matrixIndex
 to hold the Matrix index (To hold the orders of 2-D arrays') More...
 
long int * totalEdgeWeights
 Array of the Total Edge Weights. More...
 
bool ready
 True when matrix is constructed. False if not. More...
 

Private Member Functions

void TrainThread (Markov::API::Concurrency::ThreadSharedListHandler *listhandler, char delimiter)
 A single thread invoked by the Train function. More...
 
void GenerateThread (std::mutex *outputLock, unsigned long int n, std::ofstream *wordlist, int minLen, int maxLen)
 A single thread invoked by the Generate function. More...
 

Private Attributes

std::ifstream * datasetFile
 Dataset file input of our system. More...
 
std::ofstream * modelSavefile
 File to save model of our system. More...
 
std::ofstream * outputFile
 
std::map< char, Node< char > * > nodes
 Map of nodes: the key is the node's NodeValue and the value is the node pointer. More...
 
Node< char > * starterNode
 Starter Node of this model. More...
 
std::vector< Edge< char > * > edges
 A list of all edges in this model. More...
 

Detailed Description

Class to flatten and reduce Markov::Model to a Matrix.

Matrix-level operations can be used for generation events, giving a significant performance improvement at the cost of O(N) memory complexity (slow mode uses O(1) memory space).

To limit the maximum memory usage, each generation operation is partitioned into chunks of 50M for allocation. Threads are synchronized and files are flushed every 50M operations.

Definition at line 23 of file modelMatrix.h.

Constructor & Destructor Documentation

◆ ModelMatrix()

Markov::API::ModelMatrix::ModelMatrix ( )

Definition at line 15 of file modelMatrix.cpp.

15  {
16  this->ready = false;
17 }

References ready.

Member Function Documentation

◆ AdjustEdge()

void Markov::Model< char >::AdjustEdge ( const NodeStorageType payload,
long int  occurrence 
)
inherited

Adjust the model with a single string.

Start from the starter node and, for each character, adjust the EdgeWeight of the edge from the current node to the next, until the NULL character is reached.

Then, update the EdgeWeight of the edge from the current node to the terminator node.

This function is used for training purposes, as it can be used for adjusting the model with each line of the corpus file.

Example Use: Create an empty model and train it with string: "testdata"

char test[] = "testdata";
model.AdjustEdge(test, 15);
Parameters
payload- String that is passed from the training, and will be used to adjust the model
occurrence- Occurrence of this string.
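
The path-walking idea can be tried in isolation with a small stand-alone sketch. It is not the library's implementation: ToyModel and ToyAdjustEdge are hypothetical names, the model is reduced to a map of (source, target) character pairs, and, unlike the real method, missing edges are created instead of causing an early return.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for the model: edge weights keyed by (source, target) characters.
// '\0' stands for the starter node and '\xff' for the terminator node.
using ToyModel = std::map<std::pair<char, char>, long>;

// Add `occurrence` to every edge on the path
// starter -> payload[0] -> ... -> payload[n-1] -> terminator.
void ToyAdjustEdge(ToyModel& model, const std::string& payload, long occurrence) {
    if (payload.empty()) return;  // nothing to adjust for an empty string
    assert(occurrence != 0);      // a zero adjustment would be a no-op
    char current = '\0';
    for (char next : payload) {
        model[{current, next}] += occurrence;
        current = next;
    }
    model[{current, '\xff'}] += occurrence;
}
```

Training on a corpus then amounts to calling this once per line, with the occurrence count of that line.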

Definition at line 109 of file model.h.

template <typename NodeStorageType>
void Markov::Model<NodeStorageType>::AdjustEdge(const NodeStorageType* payload, long int occurrence) {
    NodeStorageType p = payload[0];
    Markov::Node<NodeStorageType>* curnode = this->starterNode;
    Markov::Edge<NodeStorageType>* e;
    int i = 0;

    // Nothing to adjust for an empty string.
    if (p == 0) return;
    // Walk the payload, adding the occurrence count to each edge on the path.
    while (p != 0) {
        e = curnode->FindEdge(p);
        if (e == NULL) return;
        e->AdjustEdge(occurrence);
        curnode = e->RightNode();
        p = payload[++i];
    }

    // Finally, adjust the edge from the last character to the terminator node.
    e = curnode->FindEdge('\xff');
    e->AdjustEdge(occurrence);
    return;
}

◆ Buff()

void Markov::API::MarkovPasswords::Buff ( const char *  str,
double  multiplier,
bool  bDontAdjustSelfLoops = true,
bool  bDontAdjustExtendedLoops = false 
)
inherited

Buff expression of some characters in the model.

Parameters
str- A string containing all the characters to be buffed
multiplier- A constant value to buff the nodes with.
bDontAdjustSelfLoops- Do not adjust weights if target node is same as source node
bDontAdjustExtendedLoops- Do not adjust if both source and target nodes are in first parameter
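
The effect of the two flags is easier to see on a toy model. The sketch below is an illustration only: EdgeWeights and ToyBuff are hypothetical names, and the model is again reduced to a map of (source, target) pairs. As in the listing further down, the weight is increased by weight * multiplier rather than replaced.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for the model: (source, target) -> edge weight.
using EdgeWeights = std::map<std::pair<char, char>, long>;

// Increase the weight of every edge whose target is in `chars` by weight * multiplier,
// optionally skipping self loops and edges whose source is also in `chars`.
void ToyBuff(EdgeWeights& model, const std::string& chars, double multiplier,
             bool skipSelfLoops = true, bool skipExtendedLoops = false) {
    assert(multiplier >= 0.0);  // assumption of this sketch: buffs never reduce weights
    for (auto& entry : model) {
        const char source = entry.first.first;
        const char target = entry.first.second;
        if (chars.find(target) == std::string::npos) continue;
        if (skipSelfLoops && source == target) continue;
        if (skipExtendedLoops && chars.find(source) != std::string::npos) continue;
        entry.second += static_cast<long>(entry.second * multiplier);
    }
}
```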

Definition at line 153 of file markovPasswords.cpp.

153  {
154  std::string buffstr(str);
155  std::map< char, Node< char > * > *nodes;
156  std::map< char, Edge< char > * > *edges;
157  nodes = this->Nodes();
158  int i=0;
159  for (auto const& [repr, node] : *nodes){
160  edges = node->Edges();
161  for (auto const& [targetrepr, edge] : *edges){
162  if(buffstr.find(targetrepr)!= std::string::npos){
163  if(bDontAdjustSelfLoops && repr==targetrepr) continue;
164  if(bDontAdjustExtendedLoops){
165  if(buffstr.find(repr)!= std::string::npos){
166  continue;
167  }
168  }
169  long int weight = edge->EdgeWeight();
170  weight = weight*multiplier;
171  edge->AdjustEdge(weight);
172  }
173 
174  }
175  i++;
176  }
177 
178  this->OptimizeEdgeOrder();
179 }

References Markov::Edge< NodeStorageType >::AdjustEdge(), Markov::Node< storageType >::Edges(), Markov::Edge< NodeStorageType >::EdgeWeight(), Markov::Model< NodeStorageType >::Nodes(), and Markov::Model< NodeStorageType >::OptimizeEdgeOrder().

Referenced by main().

◆ ConstructMatrix()

bool Markov::API::ModelMatrix::ConstructMatrix ( )

Construct the related Matrix data for the model.

This operation can be used after importing/training to allocate and populate the matrix content.

This will initialize:

  • char **edgeMatrix -> a 2D array mapping the left and right connections of each edge.
  • long int **valueMatrix -> a 2D array representing the edge weights.
  • int matrixSize -> size of the matrix, i.e. the total number of nodes.
  • char *matrixIndex -> order of the nodes in the model.
  • long int *totalEdgeWeights -> total edge weights of each Node.

Returns
True if constructed. False if already constructed.
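
The flattening step can be pictured with a small stand-alone sketch. FlatModel, ToyGraph and Flatten are hypothetical names, and the sketch uses std::vector rows of varying length rather than the fixed-size raw arrays of the class, so it only mirrors the overall shape of the data.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical flattened form, loosely mirroring matrixIndex, edgeMatrix,
// valueMatrix and totalEdgeWeights.
struct FlatModel {
    std::vector<char> index;                // node order
    std::vector<std::vector<char>> edges;   // target character of each edge, per node
    std::vector<std::vector<long>> values;  // weight of each edge, per node
    std::vector<long> totals;               // sum of edge weights, per node
};

// Toy adjacency list: node character -> (target character, weight) pairs.
using ToyGraph = std::map<char, std::vector<std::pair<char, long>>>;

FlatModel Flatten(const ToyGraph& graph) {
    FlatModel flat;
    for (const auto& entry : graph) {
        flat.index.push_back(entry.first);
        std::vector<char> rowEdges;
        std::vector<long> rowValues;
        long total = 0;
        for (const auto& edge : entry.second) {
            rowEdges.push_back(edge.first);
            rowValues.push_back(edge.second);
            total += edge.second;
        }
        flat.edges.push_back(rowEdges);
        flat.values.push_back(rowValues);
        flat.totals.push_back(total);
    }
    // Every node contributes exactly one row to each array.
    assert(flat.index.size() == flat.totals.size());
    return flat;
}
```

Once the rows are contiguous, a generation step only needs array indexing, with no pointer chasing through Node and Edge objects.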

Definition at line 31 of file modelMatrix.cpp.

31  {
32  if(this->ready) return false;
33  this->matrixSize = this->StarterNode()->edgesV.size() + 2;
34 
35  this->matrixIndex = new char[this->matrixSize];
36  this->totalEdgeWeights = new long int[this->matrixSize];
37 
38  this->edgeMatrix = new char*[this->matrixSize];
39  for(int i=0;i<this->matrixSize;i++){
40  this->edgeMatrix[i] = new char[this->matrixSize];
41  }
42  this->valueMatrix = new long int*[this->matrixSize];
43  for(int i=0;i<this->matrixSize;i++){
44  this->valueMatrix[i] = new long int[this->matrixSize];
45  }
46  std::map< char, Node< char > * > *nodes;
47  nodes = this->Nodes();
48  int i=0;
49  for (auto const& [repr, node] : *nodes){
50  if(repr!=0) this->matrixIndex[i] = repr;
51  else this->matrixIndex[i] = 199;
52  this->totalEdgeWeights[i] = node->TotalEdgeWeights();
53  for(int j=0;j<this->matrixSize;j++){
54  char val = node->NodeValue();
55  if(val < 0){
56  for(int k=0;k<this->matrixSize;k++){
57  this->valueMatrix[i][k] = 0;
58  this->edgeMatrix[i][k] = 255;
59  }
60  break;
61  }
62  else if(node->NodeValue() == 0 && j>(this->matrixSize-3)){
63  this->valueMatrix[i][j] = 0;
64  this->edgeMatrix[i][j] = 255;
65  }else if(j==(this->matrixSize-1)) {
66  this->valueMatrix[i][j] = 0;
67  this->edgeMatrix[i][j] = 255;
68  }else{
69  this->valueMatrix[i][j] = node->edgesV[j]->EdgeWeight();
70  this->edgeMatrix[i][j] = node->edgesV[j]->RightNode()->NodeValue();
71  }
72 
73  }
74  i++;
75  }
76  this->ready = true;
77  return true;
78  //this->DumpJSON();
79 }

References edgeMatrix, Markov::Edge< NodeStorageType >::EdgeWeight(), matrixIndex, matrixSize, Markov::Model< NodeStorageType >::Nodes(), Markov::Node< storageType >::NodeValue(), ready, Markov::Edge< NodeStorageType >::RightNode(), Markov::Model< NodeStorageType >::StarterNode(), totalEdgeWeights, Markov::Node< storageType >::TotalEdgeWeights(), and valueMatrix.

Referenced by Markov::Markopy::CUDA::BOOST_PYTHON_MODULE(), Markov::Markopy::BOOST_PYTHON_MODULE(), Import(), and Train().

◆ DeallocateMatrix()

bool Markov::API::ModelMatrix::DeallocateMatrix ( )
protected

Deallocate matrix and make it ready for re-construction.

Returns
True if deallocated. False if matrix was not initialized

Definition at line 81 of file modelMatrix.cpp.

81  {
82  if(!this->ready) return false;
83  delete[] this->matrixIndex;
84  delete[] this->totalEdgeWeights;
85 
86  for(int i=0;i<this->matrixSize;i++){
87  delete[] this->edgeMatrix[i];
88  }
89  delete[] this->edgeMatrix;
90 
91  for(int i=0;i<this->matrixSize;i++){
92  delete[] this->valueMatrix[i];
93  }
94  delete[] this->valueMatrix;
95 
96  this->matrixSize = -1;
97  this->ready = false;
98  return true;
99 }

References edgeMatrix, matrixIndex, matrixSize, ready, totalEdgeWeights, and valueMatrix.

Referenced by Import(), and Train().

◆ DumpJSON()

void Markov::API::ModelMatrix::DumpJSON ( )

Debug function to dump the model to a JSON file.

Might not work 100%. Not meant for production use.

Definition at line 101 of file modelMatrix.cpp.

101  {
102 
103  std::cout << "{\n \"index\": \"";
104  for(int i=0;i<this->matrixSize;i++){
105  if(this->matrixIndex[i]=='"') std::cout << "\\\"";
106  else if(this->matrixIndex[i]=='\\') std::cout << "\\\\";
107  else if(this->matrixIndex[i]==0) std::cout << "\\\\x00";
108  else if(i==0) std::cout << "\\\\xff";
109  else if(this->matrixIndex[i]=='\n') std::cout << "\\n";
110  else std::cout << this->matrixIndex[i];
111  }
112  std::cout <<
113  "\",\n"
114  " \"edgemap\": {\n";
115 
116  for(int i=0;i<this->matrixSize;i++){
117  if(this->matrixIndex[i]=='"') std::cout << " \"\\\"\": [";
118  else if(this->matrixIndex[i]=='\\') std::cout << " \"\\\\\": [";
119  else if(this->matrixIndex[i]==0) std::cout << " \"\\\\x00\": [";
120  else if(this->matrixIndex[i]<0) std::cout << " \"\\\\xff\": [";
121  else std::cout << " \"" << this->matrixIndex[i] << "\": [";
122  for(int j=0;j<this->matrixSize;j++){
123  if(this->edgeMatrix[i][j]=='"') std::cout << "\"\\\"\"";
124  else if(this->edgeMatrix[i][j]=='\\') std::cout << "\"\\\\\"";
125  else if(this->edgeMatrix[i][j]==0) std::cout << "\"\\\\x00\"";
126  else if(this->edgeMatrix[i][j]<0) std::cout << "\"\\\\xff\"";
127  else if(this->matrixIndex[i]=='\n') std::cout << "\"\\n\"";
128  else std::cout << "\"" << this->edgeMatrix[i][j] << "\"";
129  if(j!=this->matrixSize-1) std::cout << ", ";
130  }
131  std::cout << "],\n";
132  }
133  std::cout << "},\n";
134 
135  std::cout << "\" weightmap\": {\n";
136  for(int i=0;i<this->matrixSize;i++){
137  if(this->matrixIndex[i]=='"') std::cout << " \"\\\"\": [";
138  else if(this->matrixIndex[i]=='\\') std::cout << " \"\\\\\": [";
139  else if(this->matrixIndex[i]==0) std::cout << " \"\\\\x00\": [";
140  else if(this->matrixIndex[i]<0) std::cout << " \"\\\\xff\": [";
141  else std::cout << " \"" << this->matrixIndex[i] << "\": [";
142 
143  for(int j=0;j<this->matrixSize;j++){
144  std::cout << this->valueMatrix[i][j];
145  if(j!=this->matrixSize-1) std::cout << ", ";
146  }
147  std::cout << "],\n";
148  }
149  std::cout << " }\n}\n";
150 }

References edgeMatrix, matrixIndex, matrixSize, and valueMatrix.

Referenced by Markov::Markopy::CUDA::BOOST_PYTHON_MODULE(), and Markov::Markopy::BOOST_PYTHON_MODULE().

◆ Edges()

std::vector<Edge<char >*>* Markov::Model< char >::Edges ( )
inlineinherited

Return a vector of all the edges in the model.

Returns
vector of edges

Definition at line 176 of file model.h.

176 { return &edges;}

◆ Export() [1/2]

bool Markov::Model< char >::Export ( const char *  filename)
inherited

Open a file to export with filename, and call bool Model::Export with std::ofstream.

Returns
True if successful, False for incomplete models or corrupt file formats

Example Use: Export file to filename

model.Export("test.mdl");

Definition at line 166 of file model.h.

300  {
301  std::ofstream exportfile;
302  exportfile.open(filename);
303  return this->Export(&exportfile);
304 }

◆ Export() [2/2]

bool Markov::Model< char >::Export ( std::ofstream *  f)
inherited

Export a file of the model.

The file contains a list of edges, one per line, in the format left_repr,EdgeWeight,right_repr. For more information on the format, check out the project wiki or the GitHub README.

Iterate over the vertices and their edges, and write them to the file.

Returns
True if successful, False for incomplete models.

Example Use: Export file to ofstream

std::ofstream file("test.mdl");
model.Export(&file);

Definition at line 155 of file model.h.

template <typename NodeStorageType>
bool Markov::Model<NodeStorageType>::Export(std::ofstream* f) {
    Markov::Edge<NodeStorageType>* e;
    for (std::vector<int>::size_type i = 0; i != this->edges.size(); i++) {
        e = this->edges[i];
        //std::cout << e->LeftNode()->NodeValue() << "," << e->EdgeWeight() << "," << e->RightNode()->NodeValue() << "\n";
        // One line per edge: left character, weight, right character.
        *f << e->LeftNode()->NodeValue() << "," << e->EdgeWeight() << "," << e->RightNode()->NodeValue() << "\n";
    }

    return true;
}

◆ FastRandomWalk() [1/2]

int Markov::API::ModelMatrix::FastRandomWalk ( unsigned long int  n,
const char *  wordlistFileName,
int  minLen = 6,
int  maxLen = 12,
int  threads = 20,
bool  bFileIO = true 
)

Random walk on the Matrix-reduced Markov::Model.

This has an O(N) Memory complexity. To limit the maximum usage, requests with n>50M are partitioned using Markov::API::ModelMatrix::FastRandomWalkPartition.

If n>50M, threads are going to be synced, files are going to be flushed, and buffers will be reallocated every 50M generations. This comes at a minor performance penalty.

While it has the same functionality, this operation reduces Markov::API::MarkovPasswords::Generate runtime by 96.5%.

This function has deprecated Markov::API::MarkovPasswords::Generate, and will eventually replace it.

Parameters
n- Number of passwords to generate.
wordlistFileName- Filename to write to
minLen- Minimum password length to generate
maxLen- Maximum password length to generate
threads- number of OS threads to spawn
bFileIO- If false, filename will be ignored and will output to stdout.
Markov::API::ModelMatrix mp;
mp.Import("models/finished.mdl");
mp.FastRandomWalk(50000000,"./wordlist.txt",6,12,25, true);

Definition at line 217 of file modelMatrix.cpp.

217  {
218  std::ofstream wordlist;
219  if(bFileIO)
220  wordlist.open(wordlistFileName);
221  this->FastRandomWalk(n, &wordlist, minLen, maxLen, threads, bFileIO);
222  return 0;
223 }

References FastRandomWalk().

Referenced by Markov::Markopy::BOOST_PYTHON_MODULE().

◆ FastRandomWalk() [2/2]

int Markov::API::ModelMatrix::FastRandomWalk ( unsigned long int  n,
std::ofstream *  wordlist,
int  minLen = 6,
int  maxLen = 12,
int  threads = 20,
bool  bFileIO = true 
)
protected

Random walk on the Matrix-reduced Markov::Model.

This has an O(N) Memory complexity. To limit the maximum usage, requests with n>50M are partitioned using Markov::API::ModelMatrix::FastRandomWalkPartition.

If n>50M, threads are going to be synced, files are going to be flushed, and buffers will be reallocated every 50M generations. This comes at a minor performance penalty.

While it has the same functionality, this operation reduces Markov::API::MarkovPasswords::Generate runtime by 96.5%.

This function has deprecated Markov::API::MarkovPasswords::Generate, and will eventually replace it.

Parameters
n- Number of passwords to generate.
wordlist- Output file stream to write to
minLen- Minimum password length to generate
maxLen- Maximum password length to generate
threads- number of OS threads to spawn
bFileIO- If false, filename will be ignored and will output to stdout.
mp.Import("models/finished.mdl");
mp.FastRandomWalk(50000000,"./wordlist.txt",6,12,25, true);

Definition at line 204 of file modelMatrix.cpp.

int Markov::API::ModelMatrix::FastRandomWalk(unsigned long int n, std::ofstream *wordlist, int minLen, int maxLen, int threads, bool bFileIO) {
    std::mutex mlock;
    if (n <= 50000000ull) {
        this->FastRandomWalkPartition(&mlock, wordlist, n, minLen, maxLen, bFileIO, threads);
    } else {
        // Generate in partitions of 50M to cap peak memory usage.
        unsigned long int numberOfPartitions = n / 50000000ull;
        for (unsigned long int i = 0; i < numberOfPartitions; i++)
            this->FastRandomWalkPartition(&mlock, wordlist, 50000000ull, minLen, maxLen, bFileIO, threads);
        // Generate the remainder that the full partitions do not cover.
        unsigned long int remainder = n % 50000000ull;
        if (remainder > 0)
            this->FastRandomWalkPartition(&mlock, wordlist, remainder, minLen, maxLen, bFileIO, threads);
    }
    return 0;
}

References FastRandomWalkPartition().

Referenced by FastRandomWalk().

◆ FastRandomWalkPartition()

void Markov::API::ModelMatrix::FastRandomWalkPartition ( std::mutex *  mlock,
std::ofstream *  wordlist,
unsigned long int  n,
int  minLen,
int  maxLen,
bool  bFileIO,
int  threads 
)
protected

A single partition of FastRandomWalk event.

Since FastRandomWalk has to allocate its output buffer before operation starts and writes data in chunks, large n parameters would lead to huge memory allocations. Without Partitioning:

  • 50M results, 12 characters max -> 550 MB memory allocation
  • 5B results, 12 characters max -> 55 GB memory allocation
  • 50B results, 12 characters max -> 550 GB memory allocation

Instead, FastRandomWalk is partitioned per 50M generations to limit the top memory need.
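
The partitioning arithmetic can be sketched on its own. PartitionSizes and kPartitionSize are hypothetical names introduced for illustration; the 50M chunk size comes from the text, while emitting a final partial chunk for any remainder is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Chunk size taken from the 50M figure in the text.
constexpr std::uint64_t kPartitionSize = 50000000ULL;

// Split a request for n results into full partitions plus one final partial
// partition for the remainder, if there is one.
std::vector<std::uint64_t> PartitionSizes(std::uint64_t n,
                                          std::uint64_t chunk = kPartitionSize) {
    assert(chunk > 0);
    std::vector<std::uint64_t> sizes(n / chunk, chunk);  // n / chunk full partitions
    if (n % chunk != 0) sizes.push_back(n % chunk);      // leftover results
    return sizes;
}
```

With this split, a request for 120M results would be served as two partitions of 50M and one of 20M, so the peak buffer size is bounded by a single 50M partition.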

Parameters
mlock- mutex lock to distribute to child threads
wordlist- Reference to the wordlist file to write to
n- Number of passwords to generate.
minLen- Minimum password length to generate
maxLen- Maximum password length to generate
bFileIO- If false, the wordlist file will be ignored and output will go to stdout.
threads- number of OS threads to spawn

Definition at line 225 of file modelMatrix.cpp.

void Markov::API::ModelMatrix::FastRandomWalkPartition(std::mutex *mlock, std::ofstream *wordlist, unsigned long int n, int minLen, int maxLen, bool bFileIO, int threads) {
    unsigned long int iterationsPerThread = n / threads;
    unsigned long int iterationsPerThreadCarryOver = n % threads;

    std::vector<std::thread*> threadsV;

    int id = 0;
    for (int i = 0; i < threads; i++) {
        threadsV.push_back(new std::thread(&Markov::API::ModelMatrix::FastRandomWalkThread, this, mlock, wordlist, iterationsPerThread, minLen, maxLen, id, bFileIO));
        id++;
    }

    // One extra thread handles the n % threads leftover iterations.
    threadsV.push_back(new std::thread(&Markov::API::ModelMatrix::FastRandomWalkThread, this, mlock, wordlist, iterationsPerThreadCarryOver, minLen, maxLen, id, bFileIO));

    // Join and free every thread, including the carry-over thread.
    for (std::size_t i = 0; i < threadsV.size(); i++) {
        threadsV[i]->join();
        delete threadsV[i];
    }
}

References FastRandomWalkThread().

Referenced by FastRandomWalk().

◆ FastRandomWalkThread()

void Markov::API::ModelMatrix::FastRandomWalkThread ( std::mutex *  mlock,
std::ofstream *  wordlist,
unsigned long int  n,
int  minLen,
int  maxLen,
int  id,
bool  bFileIO 
)
protected

A single thread of a single partition of FastRandomWalk.

A FastRandomWalkPartition will initiate as many of this function as requested.

This function contains the bulk of the generation algorithm.

Parameters
mlock- mutex lock shared with the other threads of the partition
wordlist- Reference to the wordlist file to write to
n- Number of passwords to generate.
minLen- Minimum password length to generate
maxLen- Maximum password length to generate
id- DEPRECATED Thread id - No longer used
bFileIO- If false, the wordlist file will be ignored and output will go to stdout.
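
The core of the generation algorithm is a weighted choice over one row of the value matrix. The following is a minimal stand-alone sketch of that step: SelectEdge is a hypothetical helper, `weights` plays the role of one valueMatrix row, `total` the matching totalEdgeWeights entry, and `roll` stands in for the raw output of the random engine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pick the column whose cumulative weight first exceeds the reduced roll.
// Heavier edges cover a larger share of [0, total) and are chosen more often.
std::size_t SelectEdge(const std::vector<long>& weights, long total, unsigned long roll) {
    assert(total > 0 && !weights.empty());
    long selection = static_cast<long>(roll % static_cast<unsigned long>(total));
    for (std::size_t j = 0; j < weights.size(); ++j) {
        selection -= weights[j];
        if (selection < 0) return j;
    }
    return weights.size() - 1;  // only reached if total exceeds the sum of weights
}
```

Because the scan is linear, rows sorted by descending weight tend to terminate after fewer iterations.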

Definition at line 153 of file modelMatrix.cpp.

void Markov::API::ModelMatrix::FastRandomWalkThread(std::mutex *mlock, std::ofstream *wordlist, unsigned long int n, int minLen, int maxLen, int id, bool bFileIO) {
    if (n == 0) return;

    Markov::Random::Marsaglia MarsagliaRandomEngine;
    char* e;
    // Each password needs at most maxLen characters plus a newline.
    char *res = new char[(maxLen + 2) * n];
    int index = 0;
    char next;
    int len = 0;
    long int selection;
    char cur;
    long int bufferctr = 0;
    for (unsigned long int i = 0; i < n; i++) {
        cur = 199;  // value that marks the starter node in matrixIndex
        len = 0;
        while (true) {
            // Locate the row of the current node.
            e = strchr(this->matrixIndex, cur);
            index = e - this->matrixIndex;
            // Weighted choice: subtract edge weights until the roll goes negative.
            selection = MarsagliaRandomEngine.random() % this->totalEdgeWeights[index];
            for (int j = 0; j < this->matrixSize; j++) {
                selection -= this->valueMatrix[index][j];
                if (selection < 0) {
                    next = this->edgeMatrix[index][j];
                    break;
                }
            }

            if (len >= maxLen) break;
            else if ((next < 0) && (len < minLen)) continue;  // too short to terminate, roll again
            else if (next < 0) break;                         // terminator reached
            cur = next;
            res[bufferctr + len++] = cur;
        }
        res[bufferctr + len++] = '\n';
        bufferctr += len;
    }
    res[bufferctr] = '\0';  // terminate the buffer before streaming it

    // Write the whole buffer in one locked operation.
    mlock->lock();
    if (bFileIO) *wordlist << res;
    else std::cout << res;
    mlock->unlock();
    delete[] res;
}

References edgeMatrix, matrixIndex, matrixSize, Markov::Random::Marsaglia::random(), totalEdgeWeights, and valueMatrix.

Referenced by FastRandomWalkPartition().

◆ Generate()

void Markov::API::MarkovPasswords::Generate ( unsigned long int  n,
const char *  wordlistFileName,
int  minLen = 6,
int  maxLen = 12,
int  threads = 20 
)
inherited

Call Markov::Model::RandomWalk n times, and collect output.

Generate from model and write results to a file.

Deprecated:
See Markov::API::ModelMatrix::FastRandomWalk for more information. This has been replaced with a much more performance-optimized method. FastRandomWalk will reduce the runtime by 96.5% on average.
Parameters
n- Number of passwords to generate.
wordlistFileName- Filename to write to
minLen- Minimum password length to generate
maxLen- Maximum password length to generate
threads- number of OS threads to spawn

Definition at line 118 of file markovPasswords.cpp.

118  {
119  char* res;
120  char print[100];
121  std::ofstream wordlist;
122  wordlist.open(wordlistFileName);
123  std::mutex mlock;
124  int iterationsPerThread = n/threads;
125  int iterationsCarryOver = n%threads;
126  std::vector<std::thread*> threadsV;
127  for(int i=0;i<threads;i++){
128  threadsV.push_back(new std::thread(&Markov::API::MarkovPasswords::GenerateThread, this, &mlock, iterationsPerThread, &wordlist, minLen, maxLen));
129  }
130 
131  for(int i=0;i<threads;i++){
132  threadsV[i]->join();
133  delete threadsV[i];
134  }
135 
136  this->GenerateThread(&mlock, iterationsCarryOver, &wordlist, minLen, maxLen);
137 
138 }

References Markov::API::MarkovPasswords::GenerateThread().

Referenced by Markov::Markopy::BOOST_PYTHON_MODULE(), and Markov::GUI::Generate::generation().

◆ GenerateThread()

void Markov::API::MarkovPasswords::GenerateThread ( std::mutex *  outputLock,
unsigned long int  n,
std::ofstream *  wordlist,
int  minLen,
int  maxLen 
)
privateinherited

A single thread invoked by the Generate function.

DEPRECATED: See Markov::API::ModelMatrix::FastRandomWalkThread for more information. This has been replaced with a much more performance-optimized method. FastRandomWalk will reduce the runtime by 96.5% on average.

Parameters
outputLock- shared mutex lock to lock during output operation. Prevents race condition on write.
n- number of lines to be generated by this thread
wordlist- wordlist file
minLen- Minimum password length to generate
maxLen- Maximum password length to generate

Definition at line 140 of file markovPasswords.cpp.

void Markov::API::MarkovPasswords::GenerateThread(std::mutex *outputLock, unsigned long int n, std::ofstream *wordlist, int minLen, int maxLen) {
    if (n == 0) return;

    char* res = new char[maxLen + 5];
    Markov::Random::Marsaglia MarsagliaRandomEngine;
    for (unsigned long int i = 0; i < n; i++) {
        this->RandomWalk(&MarsagliaRandomEngine, minLen, maxLen, res);
        // Serialize writes so lines from different threads do not interleave.
        outputLock->lock();
        *wordlist << res << "\n";
        outputLock->unlock();
    }
    delete[] res;
}

References Markov::Model< NodeStorageType >::RandomWalk().

Referenced by Markov::API::MarkovPasswords::Generate().

◆ Import() [1/2]

void Markov::API::ModelMatrix::Import ( const char *  filename)

Open a file to import with filename, and call bool Model::Import with std::ifstream.

Returns
True if successful, False for incomplete models or corrupt file formats

Example Use: Import a file with filename

model.Import("test.mdl");

Construct the matrix when done.

Definition at line 19 of file modelMatrix.cpp.

19  {
20  this->DeallocateMatrix();
22  this->ConstructMatrix();
23 }

References ConstructMatrix(), DeallocateMatrix(), and Markov::Model< NodeStorageType >::Import().

Referenced by Markov::Markopy::CUDA::BOOST_PYTHON_MODULE(), Markov::Markopy::BOOST_PYTHON_MODULE(), and main().

◆ Import() [2/2]

bool Markov::Model< char >::Import ( std::ifstream *  f)
inherited

Import a file to construct the model.

The file contains a list of edges, one per line, in the format left_repr,EdgeWeight,right_repr. For more information on the file format, check out the wiki and the GitHub README.

Iterate over this list, and construct nodes and edges accordingly.

Returns
True if successful, False for incomplete models or corrupt file formats

Example Use: Import a file from ifstream

std::ifstream file("test.mdl");
model.Import(&file);
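
How a single line of the model file maps to an edge can be sketched as follows. This assumes comma-separated fields as written by the Export listing; EdgeRecord and ParseEdgeLine are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// One parsed line of a model file: source character, edge weight, target character.
struct EdgeRecord {
    char source;
    long weight;
    char target;
};

// Parse a line shaped like "a,15,b": the first character is the source, the last
// character is the target, and the digits between the separators are the weight.
// Returns false for lines too short to hold all three fields.
bool ParseEdgeLine(const std::string& line, EdgeRecord& out) {
    if (line.size() < 5) return false;  // shortest valid form: "a,1,b"
    out.source = line.front();
    out.target = line.back();
    out.weight = std::strtol(line.substr(2, line.size() - 4).c_str(), nullptr, 10);
    return true;
}
```

Taking the source and target by position rather than by splitting on the separator keeps the parse unambiguous even when a node character is itself a comma.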

Definition at line 126 of file model.h.

216  {
217  std::string cell;
218 
219  char src;
220  char target;
221  long int oc;
222 
223  while (std::getline(*f, cell)) {
224  //std::cout << "cell: " << cell << std::endl;
225  src = cell[0];
226  target = cell[cell.length() - 1];
227  char* j;
228  oc = std::strtol(cell.substr(2, cell.length() - 2).c_str(),&j,10);
229  //std::cout << oc << "\n";
233  if (this->nodes.find(src) == this->nodes.end()) {
234  srcN = new Markov::Node<NodeStorageType>(src);
235  this->nodes.insert(std::pair<char, Markov::Node<NodeStorageType>*>(src, srcN));
236  //std::cout << "Creating new node at start.\n";
237  }
238  else {
239  srcN = this->nodes.find(src)->second;
240  }
241 
242  if (this->nodes.find(target) == this->nodes.end()) {
243  targetN = new Markov::Node<NodeStorageType>(target);
244  this->nodes.insert(std::pair<char, Markov::Node<NodeStorageType>*>(target, targetN));
245  //std::cout << "Creating new node at end.\n";
246  }
247  else {
248  targetN = this->nodes.find(target)->second;
249  }
250  e = srcN->Link(targetN);
251  e->AdjustEdge(oc);
252  this->edges.push_back(e);
253 
254  //std::cout << int(srcN->NodeValue()) << " --" << e->EdgeWeight() << "--> " << int(targetN->NodeValue()) << "\n";
255 
256 
257  }
258 
259  this->OptimizeEdgeOrder();
260 
261  return true;
262 }
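
The per-line parsing in the listing above can be tried in isolation. The following is a minimal sketch, not Markopy code: ParsedEdge and ParseEdgeLine are hypothetical names, and the length and digit checks are additions that the listing does not perform.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical record for one parsed line; not part of Markopy.
struct ParsedEdge {
    char src;
    long weight;
    char target;
};

// Parse one "<src><sep><weight><sep><target>" line the way the listed loop
// does: first character, last character, and the digits starting at index 2.
// Returns false for lines that are too short or have no digits at index 2.
bool ParseEdgeLine(const std::string& cell, ParsedEdge* out) {
    assert(out != nullptr);
    if (cell.size() < 5) {
        return false;  // shortest valid line looks like "a;1;b"
    }
    const std::string middle = cell.substr(2, cell.size() - 2);
    char* end = nullptr;
    const long weight = std::strtol(middle.c_str(), &end, 10);
    if (end == middle.c_str()) {
        return false;  // strtol consumed no digits
    }
    out->src = cell.front();
    out->weight = weight;
    out->target = cell.back();
    return true;
}
```

Because only positions 0, 2 and the last character are inspected, the separator itself is never checked; any single-character separator parses the same way.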

◆ Nodes()

std::map<char , Node<char >*>* Markov::Model< char >::Nodes ( )
inline inherited

Return the map of nodes.

Returns
Pointer to the map from NodeValue to node pointer

Definition at line 181 of file model.h.

181 { return &nodes;}

◆ OpenDatasetFile()

std::ifstream * Markov::API::MarkovPasswords::OpenDatasetFile ( const char *  filename)
inherited

Open dataset file and return the ifstream pointer.

Parameters
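filename - Filename to open
Returns
ifstream* to the dataset file. In the listing below, the returned pointer is the address of a function-local std::ifstream, so it dangles once the function returns and must not be dereferenced.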

Definition at line 51 of file markovPasswords.cpp.

51  {
52 
53  std::ifstream* datasetFile;
54 
55  std::ifstream newFile(filename);
56 
57  datasetFile = &newFile;
58 
59  this->Import(datasetFile);
60  return datasetFile;
61 }
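
The listing above returns the address of a stream that lives on the function's stack. One way to avoid this is to transfer ownership of a heap-allocated stream to the caller. The sketch below is not part of Markopy; OpenStream is a hypothetical helper shown only to illustrate the ownership pattern.

```cpp
#include <cassert>
#include <fstream>
#include <memory>

// Hypothetical helper, not part of Markopy: open a file and hand ownership
// of the stream to the caller. Returns nullptr if the file cannot be opened.
std::unique_ptr<std::ifstream> OpenStream(const char* filename) {
    assert(filename != nullptr);
    std::unique_ptr<std::ifstream> stream(new std::ifstream(filename));
    if (!stream->is_open()) {
        return nullptr;
    }
    return stream;  // the stream stays alive for as long as the caller holds it
}
```

The same pattern would apply to an output stream such as the one handled by Save().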

References Markov::Model< NodeStorageType >::Import().


◆ OptimizeEdgeOrder()

void Markov::Model< char >::OptimizeEdgeOrder ( )
inherited

Sort the edges of every node in the model by edge weight, in descending order.

Definition at line 186 of file model.h.

265  {
266  for (std::pair<unsigned char, Markov::Node<NodeStorageType>*> const& x : this->nodes) {
267  //std::cout << "Total edges in EdgesV: " << x.second->edgesV.size() << "\n";
268  std::sort (x.second->edgesV.begin(), x.second->edgesV.end(), [](Edge<NodeStorageType> *lhs, Edge<NodeStorageType> *rhs)->bool{
269  return lhs->EdgeWeight() > rhs->EdgeWeight();
270  });
271  //for(int i=0;i<x.second->edgesV.size();i++)
272  // std::cout << x.second->edgesV[i]->EdgeWeight() << ", ";
273  //std::cout << "\n";
274  }
275  //std::cout << "Total number of nodes: " << this->nodes.size() << std::endl;
276  //std::cout << "Total number of edges: " << this->edges.size() << std::endl;
277 }
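
The sort above places the heaviest edge first. A plausible motivation, which the source does not state, is that a weighted selection that scans edges cumulatively finds its target sooner when heavy edges come first. A minimal standalone sketch, with SimpleEdge as a hypothetical stand-in for Markov::Edge<char>:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for Markov::Edge<char>; only the fields needed here.
struct SimpleEdge {
    char target;
    long weight;
};

// Sort edges so that the heaviest edge comes first.
void SortByWeightDescending(std::vector<SimpleEdge>& edges) {
    const auto heavierFirst = [](const SimpleEdge& lhs, const SimpleEdge& rhs) {
        return lhs.weight > rhs.weight;
    };
    std::sort(edges.begin(), edges.end(), heavierFirst);
    // Postcondition: the vector is ordered by descending weight.
    assert(std::is_sorted(edges.begin(), edges.end(), heavierFirst));
}
```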

◆ RandomWalk()

char * Markov::Model< char >::RandomWalk ( Markov::Random::RandomEngine *  randomEngine,
int  minSetting,
int  maxSetting,
NodeStorageType *  buffer 
)
inherited

Do a random walk on this model.

Start from the starter node. On each node, invoke RandomNext with the random engine to pick the next node, until the terminator node is reached. If the terminator node is reached before the minimum length criterion is met, ignore the last selection and re-invoke RandomNext.

If the maximum length criterion is reached before the terminator node, cut off the generation and finish the walk. This function takes a Markov::Random::RandomEngine as a parameter to generate pseudo-random numbers from.

This library ships with two random engines, Marsaglia and Mersenne. While Mersenne output is higher in entropy, most use cases do not need very high entropy output, so Markov::Random::Marsaglia is preferable for better performance.

This function WILL NOT reallocate buffer. Make sure the buffer holds at least maxSetting + 1 characters (the generated text plus the null terminator) so that no out-of-bounds writes can happen.

Example Use: Generate 10 lines of 5 to 10 characters each with the Marsaglia engine, and print the output.

Markov::Model<char> model;
model.Import("model.mdl");
char* res = new char[11];
Markov::Random::Marsaglia MarsagliaRandomEngine;
for (int i = 0; i < 10; i++) {
    model.RandomWalk(&MarsagliaRandomEngine, 5, 10, res);
    std::cout << res << "\n";
}
delete[] res;

Parameters
randomEngine - Random engine to use for the random walks. For examples, see Markov::Random::Mersenne and Markov::Random::Marsaglia
minSetting - Minimum number of characters to generate
maxSetting - Maximum number of characters to generate
buffer - Buffer to write the result to
Returns
Null terminated string that was generated.

Definition at line 86 of file model.h.

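307  {
308  Markov::Node<NodeStorageType>* n = this->starterNode;
309  int len = 0;
310  Markov::Node<NodeStorageType>* temp_node;
311  while (true) {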
312  temp_node = n->RandomNext(randomEngine);
313  if (len >= maxSetting) {
314  break;
315  }
316  else if ((temp_node == NULL) && (len < minSetting)) {
317  continue;
318  }
319 
320  else if (temp_node == NULL){
321  break;
322  }
323 
324  n = temp_node;
325 
326  buffer[len++] = n->NodeValue();
327  }
328 
329  //null terminate the string
330  buffer[len] = 0x00;
331 
332  //do something with the generated string
333  return buffer; //for now
334 }
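
The control flow of the walk can be reproduced on a toy model. The sketch below is an illustration under stated assumptions, not Markopy's implementation: ToyModel, WeightedNext and ToyRandomWalk are hypothetical names, std::mt19937 stands in for the library's random engines, and the successor '\0' plays the role of the terminator node.

```cpp
#include <cassert>
#include <map>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model: each state maps to weighted successors.
// The key '\0' is the start state; the successor '\0' means "terminate".
using ToyModel = std::map<char, std::vector<std::pair<char, long>>>;

// Pick a successor of `state` with probability proportional to its weight.
char WeightedNext(const ToyModel& model, char state, std::mt19937& rng) {
    const auto& successors = model.at(state);
    long total = 0;
    for (const auto& s : successors) total += s.second;
    assert(total > 0);
    std::uniform_int_distribution<long> pick(1, total);
    long remaining = pick(rng);
    for (const auto& s : successors) {
        remaining -= s.second;
        if (remaining <= 0) return s.first;
    }
    return successors.back().first;  // not reached when weights are positive
}

// Mirror the listed loop: stop at maxLen, redraw when the terminator is
// selected before minLen. Like the listing, this never returns if a state
// shorter than minLen can only terminate.
std::string ToyRandomWalk(const ToyModel& model, int minLen, int maxLen,
                          std::mt19937& rng) {
    std::string out;
    char state = '\0';
    while (static_cast<int>(out.size()) < maxLen) {
        const char next = WeightedNext(model, state, rng);
        if (next == '\0') {
            if (static_cast<int>(out.size()) < minLen) continue;  // too short
            break;
        }
        state = next;
        out.push_back(state);
    }
    return out;
}
```

Returning a std::string sidesteps the caller-managed buffer of the real function; that is a design choice of the sketch, not a description of the library.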

◆ Save()

std::ofstream * Markov::API::MarkovPasswords::Save ( const char *  filename)
inherited

Export model to file.

Parameters
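filename - Export filename.
Returns
std::ofstream* of the exported file. In the listing below, the returned pointer is the address of a function-local std::ofstream, so it dangles once the function returns and must not be dereferenced.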

Definition at line 106 of file markovPasswords.cpp.

106  {
107  std::ofstream* exportFile;
108 
109  std::ofstream newFile(filename);
110 
111  exportFile = &newFile;
112 
113  this->Export(exportFile);
114  return exportFile;
115 }

References Markov::Model< NodeStorageType >::Export().


◆ StarterNode()

Node<char >* Markov::Model< char >::StarterNode ( )
inline inherited

Return starter Node.

Returns
starter node with 00 NodeValue

Definition at line 171 of file model.h.

171 { return starterNode;}

◆ Train()

void Markov::API::ModelMatrix::Train ( const char *  datasetFileName,
char  delimiter,
int  threads 
)

Train the model with the dataset file.

Parameters
datasetFileName - Name of the dataset file. If null, use class member
delimiter - a character, same as the delimiter in dataset content
threads - number of OS threads to spawn

Example Use: Import a model and train it on a corpus. The delimiter and thread count shown are placeholder values.

Markov::API::ModelMatrix mp;
mp.Import("models/2gram.mdl");
mp.Train("password.corpus", '\t', 4);

Construct the matrix when done.

Definition at line 25 of file modelMatrix.cpp.

25  {
26  this->DeallocateMatrix();
27  this->Markov::API::MarkovPasswords::Train(datasetFileName,delimiter,threads);
28  this->ConstructMatrix();
29 }

References ConstructMatrix(), DeallocateMatrix(), and Markov::API::MarkovPasswords::Train().

Referenced by Markov::Markopy::CUDA::BOOST_PYTHON_MODULE(), and Markov::Markopy::BOOST_PYTHON_MODULE().


◆ TrainThread()

void Markov::API::MarkovPasswords::TrainThread ( Markov::API::Concurrency::ThreadSharedListHandler *  listhandler,
char  delimiter 
)
private inherited

A single thread invoked by the Train function.

Parameters
listhandler - Listhandler class to read corpus from
delimiter - a character, same as the delimiter in dataset content

Definition at line 85 of file markovPasswords.cpp.

85  {
86  char format_str[] ="%ld,%s";
87  format_str[3]=delimiter;
88  std::string line;
89  while (listhandler->next(&line) && keepRunning) {
90  long int oc;
91  if (line.size() > 100) {
92  line = line.substr(0, 100);
93  }
94  char* linebuf = new char[line.length()+5];
95 #ifdef _WIN32
96  sscanf_s(line.c_str(), "%ld,%s", &oc, linebuf, line.length()+5); //<== changed format_str to-> "%ld,%s"
97 #else
98  sscanf(line.c_str(), format_str, &oc, linebuf);
99 #endif
100  this->AdjustEdge((const char*)linebuf, oc);
101  delete[] linebuf; // allocated with new char[], so the array form of delete is required
102  }
103 }
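
The thread above splits each corpus line into an occurrence count and a string with sscanf and a manually sized buffer. A sketch of the same split using std::string operations follows. SplitCorpusLine is a hypothetical helper, not part of Markopy, and it differs from the listing in two ways: it rejects malformed lines, and it keeps whitespace inside the text, whereas %s stops at the first whitespace character.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Markopy: split "<count><delimiter><text>".
// Returns false if the delimiter is missing or the count is not a number.
bool SplitCorpusLine(const std::string& line, char delimiter,
                     long* occurrence, std::string* text) {
    assert(occurrence != nullptr && text != nullptr);
    const std::string::size_type pos = line.find(delimiter);
    if (pos == std::string::npos || pos == 0) {
        return false;  // no delimiter, or nothing before it
    }
    const std::string count = line.substr(0, pos);
    char* end = nullptr;
    const long value = std::strtol(count.c_str(), &end, 10);
    if (end == count.c_str() || *end != '\0') {
        return false;  // count field is not entirely numeric
    }
    *occurrence = value;
    *text = line.substr(pos + 1);  // everything after the first delimiter
    return true;
}
```

No fixed-size buffer is involved, so no matching delete is needed.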

References Markov::Model< NodeStorageType >::AdjustEdge(), keepRunning, and Markov::API::Concurrency::ThreadSharedListHandler::next().

Referenced by Markov::API::MarkovPasswords::Train().


Member Data Documentation

◆ datasetFile

std::ifstream* Markov::API::MarkovPasswords::datasetFile
private inherited

Definition at line 123 of file markovPasswords.h.

◆ edgeMatrix

char** Markov::API::ModelMatrix::edgeMatrix
protected

2-D Character array for the edge Matrix (The characters of Nodes)

Definition at line 175 of file modelMatrix.h.

Referenced by ConstructMatrix(), DeallocateMatrix(), DumpJSON(), and FastRandomWalkThread().

◆ edges

std::vector<Edge<char >*> Markov::Model< char >::edges
private inherited

A list of all edges in this model.

Definition at line 204 of file model.h.

◆ matrixIndex

char* Markov::API::ModelMatrix::matrixIndex
protected

Holds the matrix index (the order of the rows in the 2-D arrays).

Definition at line 190 of file modelMatrix.h.

Referenced by ConstructMatrix(), DeallocateMatrix(), DumpJSON(), and FastRandomWalkThread().

◆ matrixSize

int Markov::API::ModelMatrix::matrixSize
protected

◆ modelSavefile

std::ofstream* Markov::API::MarkovPasswords::modelSavefile
private inherited

Dataset file input of our system

Definition at line 124 of file markovPasswords.h.

◆ nodes

std::map<char , Node<char >*> Markov::Model< char >::nodes
private inherited

Map from each node's NodeValue (key) to the node pointer (value).

Definition at line 193 of file model.h.

◆ outputFile

std::ofstream* Markov::API::MarkovPasswords::outputFile
private inherited

File to save model of our system

Definition at line 125 of file markovPasswords.h.

◆ ready

bool Markov::API::ModelMatrix::ready
protected

True when matrix is constructed. False if not.

Definition at line 200 of file modelMatrix.h.

Referenced by ConstructMatrix(), DeallocateMatrix(), and ModelMatrix().

◆ starterNode

Node<char >* Markov::Model< char >::starterNode
private inherited

Starter Node of this model.

Definition at line 198 of file model.h.

◆ totalEdgeWeights

long int* Markov::API::ModelMatrix::totalEdgeWeights
protected

Array of the Total Edge Weights.

Definition at line 195 of file modelMatrix.h.

Referenced by ConstructMatrix(), DeallocateMatrix(), and FastRandomWalkThread().

◆ valueMatrix

long int** Markov::API::ModelMatrix::valueMatrix
protected

2-D integer array for the value Matrix (for the weights of Edges)

Definition at line 180 of file modelMatrix.h.

Referenced by ConstructMatrix(), DeallocateMatrix(), DumpJSON(), and FastRandomWalkThread().
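
How the members documented above could fit together is sketched below. This is an assumption-laden illustration, not the library's layout: it assumes row i describes the node whose value is matrixIndex[i], that edgeMatrix[i][j] is the j-th target character and valueMatrix[i][j] its weight, and that totalEdgeWeights[i] is the row sum. FlatModel, AddRow and SelectTarget are hypothetical names, and std::vector replaces the raw arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened model. Field names echo the documented members;
// the exact layout is assumed, not taken from the source.
struct FlatModel {
    std::vector<char> matrixIndex;               // node value of each row
    std::vector<std::vector<char>> edgeMatrix;   // target characters per row
    std::vector<std::vector<long>> valueMatrix;  // edge weights per row
    std::vector<long> totalEdgeWeights;          // sum of weights per row
};

// Append one row and keep the per-row total in sync.
void AddRow(FlatModel& m, char node, const std::vector<char>& targets,
            const std::vector<long>& weights) {
    assert(targets.size() == weights.size());
    long total = 0;
    for (long w : weights) total += w;
    m.matrixIndex.push_back(node);
    m.edgeMatrix.push_back(targets);
    m.valueMatrix.push_back(weights);
    m.totalEdgeWeights.push_back(total);
}

// Given a row and a draw in [0, totalEdgeWeights[row]), return the target
// whose cumulative weight interval contains the draw.
char SelectTarget(const FlatModel& m, std::size_t row, long draw) {
    assert(row < m.matrixIndex.size());
    assert(draw >= 0 && draw < m.totalEdgeWeights[row]);
    for (std::size_t j = 0; j < m.edgeMatrix[row].size(); ++j) {
        draw -= m.valueMatrix[row][j];
        if (draw < 0) return m.edgeMatrix[row][j];
    }
    return m.edgeMatrix[row].back();  // not reached for a valid draw
}
```

Keeping a precomputed total per row means a walk needs only one random draw and one linear scan per step, with no pointer chasing through node and edge objects.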


The documentation for this class was generated from the following files: