US20220198471A1 - Graph traversal for measurement of fraudulent nodes - Google Patents
- Publication number
- US20220198471A1 (application US 17/553,265)
- Authority
- US
- United States
- Prior art keywords
- node
- graph
- nodes
- walks
- traversal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/018—Certifying business or products
- G06Q30/0185—Product, service or business identity fraud
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G06N7/005—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/389—Keeping log of transactions for guaranteeing non-repudiation of a transaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
Definitions
- Data analysis is a process for obtaining raw data and converting it into information useful for informing conclusions or supporting decision-making.
- Typical data analysis steps include collecting data, organizing data, manipulating data, and summarizing data.
- data analysis is performed automatically by computer systems on datasets that are too large and complex for analysis by a human.
- Data analysis of graph data can be challenging, particularly in the context of automated data analysis of large amounts of data, because automated data analysis of graph data requires novel techniques able to efficiently explore the graph and extract meaningful information under real-world computation and time constraints. Thus, it would be beneficial to develop techniques directed toward robust and efficient characterization of graph data in support of decision-making.
- FIG. 1 is a diagram illustrating an example of a graph with different types of nodes.
- FIG. 2 is a block diagram illustrating an embodiment of a framework for using analyzed graph data along with a machine learning model.
- FIG. 3 is a block diagram illustrating an embodiment of a system for performing graph analysis.
- FIG. 4 is a flow diagram illustrating an embodiment of a process for determining metrics of a graph based on traversal walks of the graph.
- FIG. 5 is a flow diagram illustrating an embodiment of a process for handling unlabeled nodes in a graph.
- FIG. 6 is a flow diagram illustrating an embodiment of a process for automatically performing a traversal walk on a graph from a starting node.
- FIG. 7 is a functional diagram illustrating a programmed computer system.
- the invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor.
- these implementations, or any other form that the invention may take, may be referred to as techniques.
- the order of the steps of disclosed processes may be altered within the scope of the invention.
- a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task.
- the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
- Graph traversal for measurement of fraudulent nodes is disclosed.
- a graph of nodes and edges is received.
- An identification of a starting node in the graph is received.
- Traversal walks on the graph from the starting node are automatically performed, wherein performing each of the traversal walks includes traversing to a randomly selected next node until any of one or more stopping criteria is met.
- One or more processors are used to determine one or more metrics based on the traversal walks. At least a portion of the one or more metrics is used to predict an illicit activity or entity.
- FIG. 1 is a diagram illustrating an example of a graph with different types of nodes.
- a graph refers to a structure comprising a set of objects in which at least some pairs of the objects are related. These objects are referred to as nodes (also called vertices or points) and each of the related pairs of nodes is referred to as an edge (also called a link or line).
- a graph is depicted in diagrammatic form as a set of dots or circles for the nodes, joined by lines or curves for the edges (e.g., as shown in FIG. 1 ).
- graph 100 is comprised of a plurality of nodes and edges.
- Graph 100 is comprised of three different types of nodes (node type 102 , node type 104 , and node type 106 ), as well as two different edge types (edge type 120 and edge type 122 ), which are described in further detail below.
- nodes in graph 100 are nodes 108 and 110 .
- An example of an edge is edge 112 , which connects nodes 108 and 110 .
- Edge 112 is an example of a directed edge (shown diagrammatically as an arrow from node 108 to 110 ). This may represent a temporal sequence in which node 108 comes before node 110 . Edges may also represent properties other than temporal sequence. It is also possible for edges to be undirected or bidirectional depending on the context.
- graph 100 is analyzed to aid detection of money laundering or other forms of fraud.
- Money laundering is a financial crime that involves the illegal process of obtaining money from criminal activities, such as drug or human trafficking, and making it appear legitimate.
- graph 100 represents a transaction network. In other embodiments, graph 100 represents an entity network.
- nodes in graph 100 represent individual transactions and edges connect transactions that share common entities.
- transactions include a sale, a purchase, or another type of exchange or interaction.
- entities include a person, a business, a governmental group, an account, etc. Entities can take on various roles (e.g., as merchants, clients, etc.).
- nodes are only connected by an edge if their transactions occur within a specified time window (e.g., only connect two transactions if they occurred within a 24-hour period).
- edge direction is used to encode the temporal sequence of transactions, with edges connecting older transactions to more recent transactions.
- Timing of transactions can be determined based on timestamps, which are digital records of the time of occurrence of particular events (e.g., as recorded by computers or other electronic devices). It is also possible to connect nodes using bidirectional edges (e.g., to represent two transactions as occurring within a same timestamp).
- edges may be of different types. Different edges can correspond to different entity types in the data (e.g., entity types could be payment card, payment device, IP address, etc.). For example, an edge of type ‘card’ could connect two transactions that share the same payment card, while an edge of type ‘device’ could connect two transactions that share the same device.
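As a sketch of this edge-construction rule (the transaction records, field names, entity type, and the 24-hour window below are illustrative assumptions, not data from the patent):

```python
from datetime import datetime, timedelta

# Hypothetical transaction records: (transaction id, shared entity, timestamp).
transactions = [
    ("t1", "card_A", datetime(2022, 1, 1, 9, 0)),
    ("t2", "card_A", datetime(2022, 1, 1, 15, 0)),
    ("t3", "card_A", datetime(2022, 1, 3, 9, 0)),  # outside the 24-hour window
]
WINDOW = timedelta(hours=24)

# Connect transactions that share an entity within the window; direct the
# edge from the older to the newer transaction and type it by the entity.
edges = []
for i, (id_a, ent_a, ts_a) in enumerate(transactions):
    for id_b, ent_b, ts_b in transactions[i + 1:]:
        if ent_a == ent_b and abs(ts_b - ts_a) <= WINDOW:
            older, newer = (id_a, id_b) if ts_a <= ts_b else (id_b, id_a)
            edges.append((older, newer, "card"))
```

Here only t1 and t2 end up connected: t3 shares the card but falls outside the 24-hour window of both.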
- nodes in graph 100 represent entities, such as merchants or clients, and edges connect entities that appear in the same transaction.
- when the same two entities are parties to multiple transactions (e.g., a person making several purchases at a certain retail store), the information of all the transactions may be aggregated into a single edge, and information about the individual transactions (e.g., number of transactions, average amount, maximum amount, etc.) may be included in the features of the respective edge.
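A minimal sketch of this aggregation, using hypothetical payer/payee/amount records (the feature names are illustrative, not the patent's):

```python
from collections import defaultdict

# Hypothetical transactions between entity pairs: (payer, payee, amount).
transactions = [
    ("client_1", "store_A", 30.0),
    ("client_1", "store_A", 50.0),
    ("client_1", "store_B", 20.0),
]

# Collapse all transactions between the same pair of entities into one
# edge, keeping per-pair summary statistics as edge features.
amounts = defaultdict(list)
for payer, payee, amount in transactions:
    amounts[(payer, payee)].append(amount)

edge_features = {
    pair: {"count": len(v), "avg": sum(v) / len(v), "max": max(v)}
    for pair, v in amounts.items()
}
```

The two client_1/store_A purchases collapse into a single edge with count 2, average 40.0, and maximum 50.0.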
- Entity networks can be directed, as illustrated in FIG. 1 (e.g., a directed edge from account A to account B indicating the direction of the flow of money).
- Edges may also be undirected (e.g., an undirected edge recording that a customer made a purchase with a certain device).
- each node in graph 100 is categorized as either “legitimate”, “illicit”, or “unknown”.
- node type 102 , node type 104 , and node type 106 can correspond to the “legitimate”, “illicit”, and “unknown” categories, respectively.
- the determination as to whether a node (corresponding to a transaction or an entity) is legitimate or illicit is based on a category of the entity associated with the node.
- a transaction node can be categorized as legitimate if the entity making the transaction associated with the node falls within a legitimate category.
- examples of legitimate categories may include exchanges, wallet providers, miners, and financial service providers.
- Examples of illicit categories may include scams, malware, terrorist organizations, and Ponzi schemes.
- An entity node can similarly be categorized as legitimate or illicit based on the above categories.
- the determination as to whether a node is legitimate or illicit is based on the label of the entities associated with that node.
- the data may include labels for the edges instead of the nodes. Mapping the labels from edges to nodes can be accomplished by marking a node as illicit if it is connected by at least one illicit edge.
- two transactions that are using the same illicit entity can be represented as two nodes connected by an illicit edge, and can therefore both be categorized as illicit.
- illicit connections are utilized (e.g., whether a transaction partner is connected to an illicit actor, such as someone associated with money laundering in the past or other criminal activity) to aid detection of money laundering or other fraud.
- a distance from a specified node (corresponding to a transaction or entity of interest) to an illicit node is measured to predict a likelihood of money laundering or other fraud associated with the specified node.
- the specified node is selected as a starting node from which traversal walks of a graph are performed.
- each traversal walk starts from the selected starting node and proceeds to a next directly connected node selected at random, continuing until an illicit node is reached, no more steps are possible, or another stopping criterion is met. These traversals are referred to as random walks.
- An example of a random walk for graph 100 would be (assuming node 118 is the starting node) the ordered sequence of nodes: node 118 , node 116 , node 114 , and node 108 (node 108 being a stopping point because it is an illicit node).
- a number of steps to reach the illicit node is recorded (e.g., three steps for the above random walk from node 118 to node 108 ).
- this process is repeated to generate a distribution over the number of steps of traversals upon which various statistics such as means, standard deviations, quantiles, etc. can be computed and extracted as features to be utilized to aid money laundering or other fraud detection.
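The repeated-walk process above can be sketched as follows; the toy graph, node labels, walk count, and step budget are illustrative assumptions, not the patent's data:

```python
import random
import statistics

# Toy directed graph: incoming[x] lists the (older) nodes reachable
# from x in one backward step; labels are hypothetical.
incoming = {
    "seed": ["a", "b"],
    "a": ["illicit_1"],
    "b": ["a"],
    "illicit_1": [],
}
illicit = {"illicit_1"}

def walk_length(start, rng, max_steps=10):
    """One random walk; returns the number of steps to an illicit node, or None."""
    node, steps = start, 0
    while steps < max_steps:
        neighbors = incoming.get(node, [])
        if not neighbors:
            return None  # dead end: unsuccessful walk
        node = rng.choice(neighbors)
        steps += 1
        if node in illicit:
            return steps
    return None  # gave up: unsuccessful walk

rng = random.Random(0)
lengths = [n for n in (walk_length("seed", rng) for _ in range(200)) if n is not None]
mean_distance = statistics.mean(lengths)
```

Repeating many walks yields a distribution of step counts, from which means, standard deviations, quantiles, etc. can be extracted as features.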
- graph-extracted features may be utilized in conjunction with other features associated with a transaction or entity that is being analyzed for money laundering, other fraud, etc. Such a framework is described in more detail in FIG. 2 . These graph-extracted features can also be used to train a machine learning model (or used within a rule-based framework) for detecting money laundering or other fraud (e.g., if the mean distance to an illicit node is two or fewer steps, then flag a transaction as suspicious and/or requiring review).
- a random walk process is computationally beneficial and efficient because it is typically not possible to explore every single connection in a graph, particularly for large and/or dense graphs.
- a random walk process allows for efficient sampling of all available paths.
- stopping criteria can be applied to further promote computational efficiency. Random walks can be performed regardless of whether an underlying graph is transaction-based (a transaction network) or entity-based (an entity network).
- transaction data is oftentimes collected and then analyzed offline (not in real time) because a real-time decision is not required in this context.
- FIG. 2 is a block diagram illustrating an embodiment of a framework for using analyzed graph data along with a machine learning model.
- framework 200 includes graph analysis component 204 analyzing graph data 202 to extract graph features 206 , which are inputted to machine learning model 208 .
- Machine learning model 208 uses graph features 206 along with input 210 to determine output 212 .
- graph data 202 is comprised of transaction data in graph form.
- Graph 100 of FIG. 1 is an example of graph data that may be included in graph data 202 .
- Graph data 202 can include graphs in which nodes represent transactions and/or graphs in which nodes represent entities.
- nodes and edges of graphs included in graph data 202 are labeled. For example, nodes may be labeled as legitimate or illicit.
- graph analysis component 204 receives graph data 202 .
- graph analysis component 204 performs random walks for a given transaction graph and derives various metrics based on the random walks.
- FIG. 3 illustrates an example structure for graph analysis component 204 .
- the example structure includes random walker module 302 , feature extraction module 304 , and labeling module 306 .
- the various modules may be different software or hardware sub-units.
- a random walker module receives a transaction graph, a set of seed nodes, and a parameter specifying a number of desired random walks for each of the seed nodes. Other parameters may also be received.
- the random walker module samples random walks for each seed node.
- graph traversal is limited to traveling backward in time (to past nodes), and traversal stops at the first illicit node found or when there are no more valid nodes to visit.
- a feature extraction module (e.g., feature extraction module 304 of FIG. 3 ) receives the random walks for the set of seed nodes and computes aggregated features that summarize the random walks, e.g., average number of steps needed to reach one illicit node, total number of different illicit nodes found, and so forth. Various other metrics are typically also computed.
- the random walker module relies on labels in graph data 202 to determine illicit node stopping points.
- a labeling module (e.g., labeling module 306 of FIG. 3 ) estimates labels for nodes, e.g., using a machine learning or a rules-based approach.
- the random walker, feature extraction, and labeling modules are described in further detail below (e.g., see FIG. 3 ).
- the output of the feature extraction module is graph features 206 .
- Graph features 206 can include various features associated with random walk size. Stated alternatively, graph features 206 can include various metrics based on traversal walks and related to distance between seed nodes and found illicit nodes. These metrics are described in further detail below (e.g., see FIG. 3 ).
- graph features 206 are utilized by machine learning model 208 in conjunction with input 210 . Graph features 206 can be used to enrich input data of input 210 to improve the performance of machine learning model 208 .
- machine learning model 208 is configured to predict whether a given transaction is associated with money laundering or other fraud.
- data included in input 210 may be various properties associated with the given transaction to be analyzed for money laundering or other fraud. Examples of these properties include transaction amount, transaction time, date, and location, etc.
- Graph features 206 can enrich input 210 , for example, by providing information regarding how an entity associated with the given transaction is connected to illicit actors via a node distance metric corresponding to a distance between the entity and the closest illicit actor.
- graph features 206 and input 210 are encoded as numerical vectors. Non-numerical properties of input 210 can be converted to numerical representations.
- Machine learning model 208 determines output 212 based on graph features 206 and input 210 .
- output 212 includes a binary (yes or no) determination (e.g., 0 or 1 result) as to whether the given transaction is associated with money laundering or other fraud.
- forms of fraud other than money laundering include account takeover, account-opening fraud, identity theft, and so forth.
- output 212 can take the form of a score (e.g., indicating a probability of money laundering or other fraud).
- Output 212 can also include an analysis of various features utilized by machine learning model 208 that contributed to a money laundering or other fraud prediction by machine learning model 208 (e.g., indicating which features contributed most to the prediction).
- Graph features 206 may also be utilized by a graph visualization tool that also displays output 212 to a human analyst. Visualization of graph features 206 can provide the human analyst with more information with which to review the given transaction or entity.
- Machine learning model 208 is a model that is built based on sample data (training data) in order to make its predictions or decisions without being explicitly programmed to do so. Various architectures are possible for machine learning model 208 . In some embodiments, machine learning model 208 is a random forest model. It is also possible for machine learning model 208 to utilize a neural network or other types of architectures.
- FIG. 3 is a block diagram illustrating an embodiment of a system for performing graph analysis.
- graph analysis component 300 is graph analysis component 204 of FIG. 2 .
- graph analysis component 300 includes random walker module 302 , feature extraction module 304 , and labeling module 306 .
- graph analysis component 300 (including its sub-components) is comprised of computer program instructions that are executed on a general-purpose processor, e.g., a central processing unit (CPU), of a programmed computer system.
- FIG. 7 illustrates an example of a programmed computer system. It is also possible for the logic of graph analysis component 300 to be executed on other hardware, e.g., executed using an application-specific integrated circuit (ASIC) or a field programmable gate array (FPGA).
- random walker module 302 performs traversal walks on a received graph (e.g., graph 100 of FIG. 1 ).
- the nodes in the received graph can correspond to transactions.
- the nodes can also correspond to entities.
- random walker module 302 receives as input an original transaction graph G, a list of seed nodes S ⊆ V(G), and a number of successful random walks desired, k. Successful random walks are described below.
- due to the temporal nature of the graph, random walker module 302 only traverses backward in time. Stated alternatively, it only traverses from node x_i to node x_j if x_j represents a transaction older than x_i (this will be referred to as the ‘all backwards scenario’).
- the network can be represented as a directed graph connecting older nodes to newer nodes by an edge directed from the older to the newer node.
- random walker module 302 can traverse to all nodes that are older than the seed node s. Stated alternatively, it only traverses from node x_i to node x_j if x_j represents a transaction older than s (this will be referred to as the ‘seed backwards scenario’). Node x_j could therefore be either in the past or in the future relative to node x_i in the seed backwards scenario. To avoid endless loops in this scenario, the same node cannot be chosen more than once in the same random walk.
- random walker module 302 selects a node uniformly at random from the eligible neighbors of the current node. In the ‘all backwards scenario’ described above, random walker module 302 selects a node uniformly at random from the incoming neighbors of the current node. Because edges only connect older transactions to newer transactions, there is always an end node in any random walk, which means that random walker module 302 will not be stuck in an endless loop.
- random walker module 302 selects a node uniformly at random from all neighbors (incoming and outgoing) of the current node, provided that they are older than the seed node s and have not yet been selected in the current random walk, again avoiding endless loops.
- the random walk stops and returns X_i^s as the final random walk if a stopping criterion is met.
- the random walk stops if at least one of the following criteria is met: 1) x_n is a known illicit node (corresponding to a known illicit transaction or entity) or 2) the set of eligible nodes to select from is empty (for example, in the ‘all backwards scenario’ described above, such a scenario occurs when a given node has no incoming neighbors and, consequently, random walker module 302 has no possible moves).
- the random walk also stops if a maximum walk size (an adjustable parameter) has been reached. The maximum walk size parameter is useful to avoid being stuck in large walks and, thus, reduces computation time, particularly when performing many random walks for each seed. After a node is selected, if no stopping criterion is met, then random walker module 302 randomly picks a next node to add to X_i^s and the random walk continues.
- the number of successful random walks parameter (e.g., supplied as a user input) sets a desired number of random walks ending in an illicit node from each seed node s ∈ S. Unsuccessful random walks are possible because, as discussed above, random walker module 302 may encounter a node with no eligible neighbors and finish traversing a graph without reaching an illicit node. In various embodiments, to ensure the number of desired successful random walks is reached, random walker module 302 performs as many random walks as needed. As described below, a feature (metric) that can be extracted is the fraction of successful random walks from the total number of random walks needed to reach the number of successful random walks. This is also indicative of the number of unsuccessful random walks, which may be informative.
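One possible implementation of these stopping criteria and the repeat-until-k-successful-walks loop, on a toy ‘all backwards’ graph (the node names, labels, and maximum walk size are hypothetical):

```python
import random

# Toy graph in the 'all backwards scenario': incoming[x] lists nodes older
# than x; node names and the illicit label are hypothetical.
incoming = {"s": ["a", "dead_end"], "a": ["bad"], "dead_end": [], "bad": []}
illicit = {"bad"}
MAX_WALK_SIZE = 5  # adjustable maximum walk size parameter

def random_walk(seed, rng):
    walk = [seed]
    while len(walk) - 1 < MAX_WALK_SIZE:
        options = incoming.get(walk[-1], [])
        if not options:
            return walk, False       # stopping criterion: no eligible nodes
        walk.append(rng.choice(options))
        if walk[-1] in illicit:
            return walk, True        # stopping criterion: known illicit node
    return walk, False               # stopping criterion: maximum walk size

def successful_walks(seed, k, rng):
    """Repeat walks until k of them end at an illicit node, counting attempts."""
    successes, attempts = [], 0
    while len(successes) < k:
        attempts += 1
        walk, ok = random_walk(seed, rng)
        if ok:
            successes.append(walk)
    return successes, attempts

rng = random.Random(1)
walks, attempts = successful_walks("s", k=10, rng=rng)
hit_rate = 10 / attempts  # fraction of walks that ended at an illicit node
```

Walks through dead_end are unsuccessful, so the module typically needs more than k attempts; the ratio of successes to attempts is the hit rate feature described below.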
- random walker module 302 can first determine which nodes can actually reach an illicit node. One way to do this is to invert the direction of the graph and use a descendant search algorithm for directed acyclic graphs to return all nodes reachable from a source node s in the graph G. Afterwards, the reachable nodes can be inspected to determine if at least one of them is illicit. This can be done for all nodes in a graph, and only those that can reach an illicit node are allowed to be seed nodes.
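A sketch of this eligibility check using a plain iterative search over the inverted graph (the edges and illicit label are toy assumptions; a library routine for descendants of a directed acyclic graph could serve the same purpose):

```python
# Toy directed graph (edges point from older to newer transactions);
# node names and the illicit label are hypothetical.
edges = [("old_bad", "mid"), ("mid", "new_1"), ("other", "new_2")]
illicit = {"old_bad"}

# Invert edge direction so traversal follows newer -> older.
inverted = {}
for older, newer in edges:
    inverted.setdefault(newer, []).append(older)

def reachable(source):
    """Iterative depth-first search over the inverted (acyclic) graph."""
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        for nxt in inverted.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Only nodes that can reach at least one illicit node are eligible seeds.
seeds = [n for n in ("new_1", "new_2") if reachable(n) & illicit]
```

Here new_1 can reach old_bad by walking backward, so it is allowed as a seed node, while new_2 cannot and is excluded.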
- feature extraction module 304 extracts features (also referred to as metrics) based on random walks performed by random walker module 302 .
- feature extraction module 304 receives a list of random walks for each seed node and returns a data frame of features, corresponding to the transaction or entity associated with the seed node, summarizing the random walks.
- Examples of features that can be obtained are: 1) minimum size of the random walks (min); 2) maximum size of the random walks (max); 3) mean size of the random walks (mean); 4) standard deviation of the random walk sizes (std); 5) median size of the random walks (median); 6) first quartile of the random walk sizes (q25); 7) third quartile of the random walk sizes (q75); 8) fraction of successful random walks from all the random walks performed (hit rate, i.e., percentage of random walks that ended in an illicit node); and 9) number of distinct illicit nodes in the random walks (illicit), as well as the number of legitimate nodes. For example, from a specified seed node, if 100 random walks are performed and 30 reached an illicit node, the hit rate would be 30%.
- a minimum walk size to reach an illicit node, a maximum walk size to reach an illicit node, and the corresponding mean, standard deviation, median, quartiles, and number of illicit nodes found would also be computable. It is also possible to report which nodes have no possible path to an illicit node.
- Features 1) through 7) can be considered distance metrics because they are based on measuring the number of steps to reach an illicit node from the seed node.
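Assuming a set of successful-walk lengths has already been collected for one seed node, the listed metrics might be computed along these lines (the sample lengths, walk count, and illicit-node set are invented):

```python
import statistics

# Hypothetical successful-walk lengths for one seed node, plus counts
# of total walks attempted and distinct illicit nodes encountered.
walk_lengths = [2, 3, 3, 4, 5]   # steps needed to reach an illicit node
total_walks = 10                 # walks performed, successful or not
illicit_found = {"bad_1", "bad_2"}

# statistics.quantiles with n=4 returns the three quartile cut points.
q25, median, q75 = statistics.quantiles(walk_lengths, n=4)
features = {
    "min": min(walk_lengths),
    "max": max(walk_lengths),
    "mean": statistics.mean(walk_lengths),
    "std": statistics.stdev(walk_lengths),
    "median": median,
    "q25": q25,
    "q75": q75,
    "hit_rate": len(walk_lengths) / total_walks,
    "illicit": len(illicit_found),
}
```

The resulting dictionary corresponds to one row of the per-seed feature data frame described above.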
- nodes are labeled with scores indicating likelihoods of being illicit (instead of binary legitimate or illicit classifications).
- these likelihood scores can be added up during random walks, and a cumulative score for each random walk can be recorded.
- Minimum score, maximum score, mean score, standard deviation, median score, first quartile, and third quartile metrics can be computed based on likelihood scores.
- Nodes with scores above a specified threshold (indicating a specified likelihood of being illicit) can be used as stopping points for random walks, or a stopping criterion can be used where a stopping point occurs when the accumulated scores of all nodes in the current random walk exceed a specific threshold. This allows the hit rate and number of distinct illicit nodes metrics to be calculated in the same way as described above.
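A sketch of the accumulated-score stopping criterion, with hypothetical per-node likelihood scores and threshold (the toy graph has a single path, so no random sampling is needed):

```python
# Hypothetical backward-neighbor map and per-node illicit-likelihood scores.
incoming = {"s": ["a"], "a": ["b"], "b": ["c"], "c": []}
score = {"s": 0.0, "a": 0.4, "b": 0.5, "c": 0.9}
THRESHOLD = 0.8  # stop once the accumulated likelihood exceeds this value

def scored_walk(seed):
    """Walk until the accumulated score exceeds THRESHOLD or the walk dead-ends."""
    walk, total = [seed], 0.0
    while True:
        options = incoming.get(walk[-1], [])
        if not options:
            return walk, total, False   # dead end before exceeding the threshold
        walk.append(options[0])         # single-option toy graph; no sampling needed
        total += score[walk[-1]]
        if total > THRESHOLD:
            return walk, total, True    # accumulated-score stopping criterion met

walk, total, hit = scored_walk("s")
```

The walk stops at node b because the running total 0.4 + 0.5 already exceeds the 0.8 threshold, so c is never visited.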
- In some embodiments, labeling module 306 supplies node labels to random walker module 302. This can occur when node labels are missing from a graph that random walker module 302 receives.
- In some embodiments, nodes of node type 106 in FIG. 1 are nodes missing labels. Because terminating a successful random walk depends on reaching a node known to be illicit, labels indicating which nodes are legitimate and illicit are needed. In many scenarios, labeling module 306 is not needed because human-generated labels are available for all nodes (e.g., when there is sufficient time for human analysts to review and label nodes).
- In some embodiments, labeling module 306 utilizes a machine learning model that is trained to predict labels for recent nodes that do not have labels.
- Such a machine learning model may be a neural network model that relies on various node properties (e.g., the profession, location, etc. of an entity for an entity node, or the transaction amount, location, etc. for a transaction node) to determine whether a node is likely to be legitimate or illicit.
- Labels generated by a machine learning model can be considered soft labels or pseudo-labels to be used until ground truth labels created by a human analyst are available. Another approach is to temporarily label new nodes as legitimate by default. It is also possible to utilize a rules-based approach (without using a machine learning model) to generate labels. For example, if a transaction amount is more than a specified threshold and/or an entity is from a specified list of countries, then the corresponding node may be labeled as suspicious and act as a stopping point in a random walk. It is also possible to use human labelers to generate labels.
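Such a rules-based labeler might look like the following toy sketch (the amount threshold, country codes, and dict-based node representation are placeholder assumptions, not values from the disclosure):

```python
# Placeholder rule parameters -- illustrative only.
AMOUNT_THRESHOLD = 10_000
FLAGGED_COUNTRIES = {"XX", "YY"}  # stand-in country codes

def rule_based_label(node):
    """Label a node 'suspicious' (a random-walk stopping point) or 'legitimate'."""
    if node.get("amount", 0) > AMOUNT_THRESHOLD:
        return "suspicious"
    if node.get("country") in FLAGGED_COUNTRIES:
        return "suspicious"
    return "legitimate"
```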
- FIG. 4 is a flow diagram illustrating an embodiment of a process for determining metrics of a graph based on traversal walks of the graph. In some embodiments, at least a portion of the process of FIG. 4 is performed by graph analysis component 204 of FIG. 2 and/or graph analysis component 300 of FIG. 3.
- A graph of nodes and edges is received.
- An example of a type of graph that may be received is graph 100 of FIG. 1.
- In some embodiments, the graph represents a network of transactions. Stated alternatively, each node in such a graph represents a transaction and edges connect nodes whose transactions share common entities conducting the transactions. In other embodiments, the graph represents a network of entities. Stated alternatively, each node in such a graph represents a specific entity (e.g., a person, business organization, etc.) and edges connect nodes whose entities share a common transaction.
- In some embodiments, the graph is received by random walker module 302 of FIG. 3.
- Next, an identification of a starting node in the graph is received.
- The starting node is also referred to as a seed node.
- In some embodiments, the identification of the starting node is provided by a user.
- For example, the starting node can be a node for which the user desires a prediction as to whether it is legitimate or illicit.
- In some embodiments, the starting node is one of a plurality of starting nodes provided. Stated alternatively, a set of seed nodes may be provided, from which one seed node at a time is utilized.
- In some embodiments, the identification of the starting node is received by random walker module 302 of FIG. 3.
- Traversal walks on the graph from the starting node are then automatically performed. Beginning from the starting node, subsequent nodes are selected automatically and randomly without user input.
- Performing each of the traversal walks includes traversing to a randomly selected next node until any of one or more stopping criteria is met. Examples of stopping criteria include: 1) reaching a node labeled as illicit; 2) not being able to traverse to any more nodes; and 3) reaching a maximum walk size.
- For each traversal walk, the path of the random walk, in the form of a list of nodes traversed, is generated.
- In some embodiments, the traversal walks are performed by random walker module 302 of FIG. 3.
- In some embodiments, the lists of nodes traversed for the traversal walks are transmitted to feature extraction module 304 for analysis.
- In various embodiments, each traversal walk is encoded as a sequence of nodes.
- The length of the sequence of nodes corresponds to the distance between the starting node and an ending node, which can be considered the size of the random walk.
- Successful walks are random walks that reach an illicit node.
- One or more metrics are determined based on traversal walk size, such as maximum size, minimum size, average size, size standard deviation, size quartiles, and median size.
- Successful walks can also be analyzed to determine how many distinct illicit nodes were found (unsuccessful walks would not be needed because they would not include illicit nodes).
- In some embodiments, unsuccessful walks are used to compute metrics such as the fraction of successful walks (out of the total number of walks), the ratio of successful to unsuccessful walks, etc.
- In some embodiments, the one or more metrics are determined by feature extraction module 304 of FIG. 3.
- The one or more metrics are then used to predict an illicit activity or entity.
- In some embodiments, the one or more metrics are transmitted to a machine learning model (e.g., machine learning model 208 of FIG. 2) that predicts the illicit activity or entity based on at least some of the one or more metrics and additional inputs to the machine learning model.
- For example, the machine learning model may predict whether a specific activity (e.g., a specific purchase transaction between a buyer and seller) is illicit.
- The specific purchase transaction may be associated with a seller corresponding to the starting node in the graph.
- The one or more metrics can include distance metrics (e.g., mean distance from the starting node to an illicit node corresponding to an illicit actor) that can aid the machine learning model in determining whether the specific purchase transaction is illicit.
- The machine learning model may also be tasked with directly determining whether a specific entity is illicit based on distance metrics corresponding to how close the specific entity is to illicit actors (in terms of node distances in a graph in which the specific entity corresponds to the starting node).
- FIG. 5 is a flow diagram illustrating an embodiment of a process for handling unlabeled nodes in a graph.
- In some embodiments, the process of FIG. 5 is performed by graph analysis component 300 of FIG. 3.
- In some embodiments, at least a portion of the process of FIG. 5 is performed in 402 of FIG. 4.
- At 502, a node is received.
- In some embodiments, the node is one of multiple nodes received by random walker module 302 of FIG. 3.
- At 504, it is determined whether the node is labeled. If the node is not labeled, at 506, a label for the node is generated using a machine learning model (e.g., a neural network model as discussed with respect to labeling module 306 of FIG. 3). It is also possible to use labeling approaches that do not rely on a machine learning model (e.g., rules-based approaches). If at 504 it is determined that the node is labeled, 506 is not performed.
- At 508 it is determined whether there are more nodes to analyze. If at 508 it is determined that there are more nodes to analyze, at 502 , another node is received in order to determine whether that node is labeled. If at 508 it is determined that there are no more nodes to analyze, then no further action is taken.
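The loop described above (steps 502 through 508) can be sketched as follows (the dict-based node representation, and the predict_label callback standing in for the machine learning or rules-based labeler, are assumptions):

```python
def fill_missing_labels(nodes, predict_label):
    """Walk through nodes, generating a label only where one is missing."""
    for node in nodes:                  # 502/508: receive the next node, if any
        if node.get("label") is None:   # 504: is the node already labeled?
            node["label"] = predict_label(node)  # 506: generate a label
    return nodes
```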
- FIG. 6 is a flow diagram illustrating an embodiment of a process for automatically performing a traversal walk on a graph from a starting node.
- In some embodiments, the traversal walk performed is one of a plurality of traversal walks performed in order to generate a distribution of traversal walks.
- In some embodiments, the process of FIG. 6 is performed by random walker module 302 of FIG. 3.
- In some embodiments, at least a portion of the process of FIG. 6 is performed in 406 of FIG. 4.
- At 602, a node is arrived at and a list of nodes is updated.
- The node may be a starting node of a sequence of nodes, an ending node in the sequence of nodes, or an intermediate node in a path of nodes from the starting node to the ending node.
- In various embodiments, the list of nodes is a running sequence of nodes beginning with the starting node.
- The list of nodes is updated to include the node that is arrived at, and is continually updated until the ending node is reached. The ending node is reached when at least one stopping criterion is met.
- At 604 it is determined whether at least one stopping criterion is met.
- An example of a stopping criterion is the node that is arrived at being an illicit node.
- Another example of a stopping criterion is the node that is arrived at being a terminal node (that is not an illicit node) that has no eligible neighboring nodes.
- If no stopping criterion is met, a neighboring node to traverse to is randomly selected. In various embodiments, the random selection is uniform in the sense that each eligible neighboring node has an equal probability of being selected.
- The neighboring node that is randomly selected then becomes the next arrival node at 602.
- The list of nodes is updated to reflect the addition of the randomly selected neighboring node to the sequence of nodes from the starting node to the ending node.
- If at least one stopping criterion is met, the list of nodes is provided; meeting a stopping criterion causes the traversal walk to terminate.
- The list of nodes, which describes the sequence of nodes visited in the traversal walk, is transmitted to feature extraction module 304 of FIG. 3.
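Putting the FIG. 6 steps together, one traversal walk might be sketched as follows (the adjacency-mapping graph representation, label mapping, and max_size default are illustrative assumptions; neighbor lists are assumed to contain only eligible nodes, e.g., already restricted to past nodes):

```python
import random

def traversal_walk(graph, labels, start, max_size=50, rng=random):
    """Perform one random traversal walk and return the list of nodes visited.

    graph: mapping of node -> list of eligible neighboring nodes.
    labels: mapping of node -> 'illicit' or 'legitimate'.
    """
    walk = [start]                               # 602: arrive, update the list
    while True:
        node = walk[-1]
        if labels.get(node) == "illicit":        # 604: stopping criterion met
            return walk
        neighbors = graph.get(node, [])
        if not neighbors or len(walk) >= max_size:
            return walk                          # terminal node or max walk size
        # uniform random selection among eligible neighbors
        walk.append(rng.choice(neighbors))
```

On a graph where each node has at most one eligible neighbor, the walk is deterministic, which makes the sketch easy to check by hand.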
- FIG. 7 is a functional diagram illustrating a programmed computer system.
- In some embodiments, processes associated with framework 200 of FIG. 2 and/or graph analysis component 300 of FIG. 3 are executed by computer system 700.
- In some embodiments, the processes of FIGS. 4, 5, and/or 6 are executed by computer system 700.
- Computer system 700 includes various subsystems as described below.
- Computer system 700 includes at least one microprocessor subsystem (also referred to as a processor or a central processing unit (CPU)) 702.
- Computer system 700 can be physical or virtual (e.g., a virtual machine).
- Processor 702 can be implemented by a single-chip processor or by multiple processors.
- Processor 702 is a general-purpose digital processor that controls the operation of computer system 700. Using instructions retrieved from memory 710, processor 702 controls the reception and manipulation of input data, and the output and display of data on output devices (e.g., display 718).
- Processor 702 is coupled bi-directionally with memory 710, which can include a first primary storage, typically a random-access memory (RAM), and a second primary storage area, typically a read-only memory (ROM).
- Primary storage can be used as a general storage area and as scratch-pad memory, and can also be used to store input data and processed data.
- Primary storage can also store programming instructions and data, in the form of data objects and text objects, in addition to other data and instructions for processes operating on processor 702.
- Primary storage typically includes basic operating instructions, program code, data, and objects used by processor 702 to perform its functions (e.g., programmed instructions).
- Memory 710 can include any suitable computer-readable storage media, described below, depending on whether, for example, data access needs to be bi-directional or uni-directional.
- Processor 702 can also directly and very rapidly retrieve and store frequently needed data in a cache memory (not shown).
- Persistent memory 712 (e.g., a removable mass storage device) provides additional data storage capacity for computer system 700, and is coupled either bi-directionally (read/write) or uni-directionally (read only) to processor 702.
- Persistent memory 712 can also include computer-readable media such as magnetic tape, flash memory, PC-CARDS, portable mass storage devices, holographic storage devices, and other storage devices.
- A fixed mass storage 720 can also, for example, provide additional data storage capacity. The most common example of fixed mass storage 720 is a hard disk drive.
- Persistent memory 712 and fixed mass storage 720 generally store additional programming instructions, data, and the like that typically are not in active use by processor 702. It will be appreciated that the information retained within persistent memory 712 and fixed mass storage 720 can be incorporated, if needed, in standard fashion as part of memory 710 (e.g., RAM) as virtual memory.
- Bus 714 can also be used to provide access to other subsystems and devices. As shown, these can include a display monitor 718, a network interface 716, a keyboard 704, and a pointing device 706, as well as an auxiliary input/output device interface, a sound card, speakers, and other subsystems as needed.
- Pointing device 706 can be a mouse, stylus, track ball, or tablet, and is useful for interacting with a graphical user interface.
- Network interface 716 allows processor 702 to be coupled to another computer, computer network, or telecommunications network using a network connection as shown.
- Processor 702 can receive information (e.g., data objects or program instructions) from another network or output information to another network in the course of performing method/process steps.
- Information, often represented as a sequence of instructions to be executed on a processor, can be received from and outputted to another network.
- An interface card or similar device and appropriate software implemented by (e.g., executed/performed on) processor 702 can be used to connect computer system 700 to an external network and transfer data according to standard protocols.
- Processes can be executed on processor 702, or can be performed across a network such as the Internet, intranet networks, or local area networks, in conjunction with a remote processor that shares a portion of the processing. Additional mass storage devices (not shown) can also be connected to processor 702 through network interface 716.
- An auxiliary I/O device interface (not shown) can be used in conjunction with computer system 700.
- The auxiliary I/O device interface can include general and customized interfaces that allow processor 702 to send and, more typically, receive data from other devices such as microphones, touch-sensitive displays, transducer card readers, tape readers, voice or handwriting recognizers, biometrics readers, cameras, portable mass storage devices, and other computers.
- Various embodiments disclosed herein further relate to computer storage products with a computer-readable medium that includes program code for performing various computer-implemented operations.
- The computer-readable medium is any data storage device that can store data which can thereafter be read by a computer system.
- Examples of computer-readable media include, but are not limited to, all the media mentioned above: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks; and specially configured hardware devices such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs), and ROM and RAM devices.
- Examples of program code include both machine code, as produced, for example, by a compiler, and files containing higher-level code (e.g., scripts) that can be executed using an interpreter.
- The computer system shown in FIG. 7 is but an example of a computer system suitable for use with the various embodiments disclosed herein.
- Other computer systems suitable for such use can include additional or fewer subsystems.
- Bus 714 is illustrative of any interconnection scheme serving to link the subsystems.
- Other computer architectures having different configurations of subsystems can also be utilized.
Description
- This application claims priority to U.S. Provisional Patent Application No. 63/127,941 entitled DISTANCE TO FRAUDULENT NODES IN DATASETS filed Dec. 18, 2020, which is incorporated herein by reference for all purposes.
- This application claims priority to Portugal Provisional Patent Application No. 117636 entitled GRAPH TRAVERSAL FOR MEASUREMENT OF FRAUDULENT NODES filed Dec. 14, 2021, which is incorporated herein by reference for all purposes.
- Data analysis is a process for obtaining raw data and converting it into information useful for informing conclusions or supporting decision-making. Typical data analysis steps include collecting data, organizing data, manipulating data, and summarizing data. Oftentimes, data analysis is performed automatically by computer systems on datasets that are too large and complex for analysis by a human. In some scenarios, it is desirable to apply data analysis to graph data to support decision-making based on the graph data. Data analysis of graph data can be challenging, particularly in the context of automated data analysis of large amounts of data. This is the case as automated data analysis of graph data requires novel techniques able to efficiently explore the graph to extract meaningful information under real-world computation and time constraints. Thus, it would be beneficial to develop techniques directed toward robust and efficient characterization of graph data in support of decision-making.
- Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
- FIG. 1 is a diagram illustrating an example of a graph with different types of nodes.
- FIG. 2 is a block diagram illustrating an embodiment of a framework for using analyzed graph data along with a machine learning model.
- FIG. 3 is a block diagram illustrating an embodiment of a system for performing graph analysis.
- FIG. 4 is a flow diagram illustrating an embodiment of a process for determining metrics of a graph based on traversal walks of the graph.
- FIG. 5 is a flow diagram illustrating an embodiment of a process for handling unlabeled nodes in a graph.
- FIG. 6 is a flow diagram illustrating an embodiment of a process for automatically performing a traversal walk on a graph from a starting node.
- FIG. 7 is a functional diagram illustrating a programmed computer system.
- The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term 'processor' refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
- A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
- Graph traversal for measurement of fraudulent nodes is disclosed. A graph of nodes and edges is received. An identification of a starting node in the graph is received. Traversal walks on the graph from the starting node are automatically performed, wherein performing each of the traversal walks includes traversing to a randomly selected next node until any of one or more stopping criteria is met. One or more processors are used to determine one or more metrics based on the traversal walks. At least a portion of the one or more metrics is used to predict an illicit activity or entity.
- FIG. 1 is a diagram illustrating an example of a graph with different types of nodes. A graph refers to a structure comprising a set of objects in which at least some pairs of the objects are related. These objects are referred to as nodes (also called vertices or points) and each of the related pairs of nodes is referred to as an edge (also called a link or line). Oftentimes, a graph is depicted in diagrammatic form as a set of dots or circles for the nodes, joined by lines or curves for the edges (e.g., as shown in FIG. 1). In the example illustrated, graph 100 is comprised of a plurality of nodes and edges. Graph 100 is comprised of three different types of nodes (node type 102, node type 104, and node type 106), as well as two different edge types (edge type 120 and edge type 122), which are described in further detail below. This example is illustrative and not restrictive. Different graphs with different node type and edge type compositions are also possible. Examples of nodes in graph 100 are nodes 108 and 110. An example of an edge is edge 112, which connects nodes 108 and 110 (with edge 112 pointing from node 108 to 110). This may represent a temporal sequence in which node 108 comes before node 110. Edges may also represent properties other than temporal sequence. It is also possible for edges to be undirected or bidirectional depending on the context.
- In some embodiments, graph 100 is analyzed to aid detection of money laundering or other forms of fraud. Money laundering is a financial crime that involves the illegal process of obtaining money from criminal activities, such as drug or human trafficking, and making it appear legitimate. In a money laundering (or other fraud) context, in some embodiments, graph 100 represents a transaction network. In other embodiments, graph 100 represents an entity network.
- In a transaction network scenario, nodes in graph 100 represent individual transactions and edges connect transactions that share common entities. Examples of transactions include a sale, a purchase, or another type of exchange or interaction. Examples of entities include a person, a business, a governmental group, an account, etc. Entities can take on various roles (e.g., as merchants, clients, etc.). In various embodiments, because connecting all transactions that share common entities results in very dense graphs, transactions are only connected if they occur in a specified time-window (e.g., only connect two transactions if they occurred within a 24-hour period). In addition, in some embodiments and as illustrated in FIG. 1, edge direction is used to encode the temporal sequence of transactions, with edges connecting older transactions to more recent transactions. Timing of transactions can be determined based on timestamps, which are digital records of the time of occurrence of particular events (e.g., as recorded by computers or other electronic devices). It is also possible to connect nodes using bidirectional edges (e.g., to represent two transactions as occurring within a same timestamp). In addition, in some embodiments, and as illustrated in FIG. 1, edges may be of different types. Different edges can correspond to different entity types in the data (e.g., entity types could be payment card, payment device, IP address, etc.). For example, an edge of type 'card' could connect two transactions that share the same payment card, while an edge of type 'device' could connect two transactions that share the same device.
- In an entity network scenario, nodes in graph 100 represent entities, such as merchants or clients, and edges connect entities that appear in the same transaction. When the same two entities are parties to multiple transactions (e.g., a person making several purchases at a certain retail store), the information of all the transactions may be aggregated into a single edge, and information about the individual transactions (e.g., number of transactions, average amount, maximum amount, etc.) may be included in the features of the respective edge. Entity networks can be directed, as illustrated in FIG. 1 (e.g., a directed edge from account A to account B indicating the direction of the flow of money). Edges may also be undirected (e.g., an undirected edge recording that a customer made a purchase with a certain device).
- In some embodiments, each node in graph 100 is categorized as either "legitimate", "illicit", or "unknown". For example, in graph 100, node type 102, node type 104, and node type 106 can correspond to the "legitimate", "illicit", and "unknown" categories, respectively. It is also possible for there to be only two types of nodes: "legitimate" and "illicit", such as in scenarios in which ground truth labels are available for all nodes or in scenarios in which it is possible to determine or estimate labels for all nodes. In some embodiments, the determination as to whether a node (corresponding to a transaction or an entity) is legitimate or illicit is based on a category of the entity associated with the node. For example, a transaction node can be categorized as legitimate if the entity making the transaction associated with the node falls within a legitimate category. For example, within a cryptocurrency context, examples of legitimate categories may include exchanges, wallet providers, miners, and financial service providers. Examples of illicit categories may include scams, malware, terrorist organizations, and Ponzi schemes. An entity node can similarly be categorized as legitimate or illicit based on the above categories. In some embodiments, in a transaction graph scenario, the determination as to whether a node is legitimate or illicit is based on the label of the entities associated with that node. Stated alternatively, the data may include labels for the edges instead of the nodes. Mapping the labels from edges to nodes can be accomplished by marking a node as illicit if it is connected by at least one illicit edge. For example, within a banking context, two transactions that are using the same illicit entity (e.g., an illicit payment card, device, etc.) can be represented as two nodes connected by an illicit edge, and can therefore both be categorized as illicit. In various embodiments, illicit connections are utilized (e.g., whether a transaction partner is connected to an illicit actor, such as someone associated with money laundering in the past or other criminal activity) to aid detection of money laundering or other fraud.
- In various embodiments, as described in further detail herein, a distance from a specified node (corresponding to a transaction or entity of interest) to an illicit node is measured to predict a likelihood of money laundering or other fraud associated with the specified node. In various embodiments, the specified node is selected as a starting node from which traversal walks of a graph are performed. In various embodiments, each traversal walk starts from the selected starting node and proceeds to a next directly connected node selected at random, continuing until an illicit node is reached, no more steps are possible, or another stopping criterion is met. These traversals are referred to as random walks. An example of a random walk for graph 100 would be (assuming node 118 is the starting node) the ordered sequence of nodes: node 118, node 116, node 114, and node 108 (node 108 being a stopping point because it is an illicit node). In various embodiments, a number of steps to reach the illicit node is recorded (e.g., three steps for the above random walk from node 118 to node 108). In various embodiments, this process is repeated to generate a distribution over the number of steps of traversals upon which various statistics such as means, standard deviations, quantiles, etc. can be computed and extracted as features to be utilized to aid money laundering or other fraud detection. These graph-extracted features may be utilized in conjunction with other features associated with a transaction or entity that is being analyzed for money laundering, other fraud, etc. Such a framework is described in more detail in FIG. 2. These graph-extracted features can also be used to train a machine learning model (or used within a rule-based framework) for detecting money laundering or other fraud (e.g., if the mean distance to an illicit node is two or fewer steps, then flag a transaction as suspicious and/or requiring review).
-
FIG. 2 is a block diagram illustrating an embodiment of a framework for using analyzed graph data along with a machine learning model. In the example illustrated,framework 200 includesgraph analysis component 204 analyzinggraph data 202 to extract graph features 206, which are inputted tomachine learning model 208.Machine learning model 208 uses graph features 206 along withinput 210 to determineoutput 212. - In various embodiments,
graph data 202 is comprised of transaction data in graph form.Graph 100 ofFIG. 1 is an example of graph data that may be included ingraph data 202.Graph data 202 can include graphs in which nodes represent transactions and/or graphs in which nodes represent entities. In various embodiments, nodes and edges of graphs included ingraph data 202 are labeled. For example, nodes may be labeled as legitimate or illicit. - In the example illustrated,
graph analysis component 204 receivesgraph data 202. In various embodiments,graph analysis component 204 performs random walks for a given transaction graph and derives various metrics based on the random walks.FIG. 3 illustrates an example structure forgraph analysis component 204. In the example shown inFIG. 3 , the example structure includesrandom walker module 302,feature extraction module 304, andlabeling module 306. The various modules may be different software or hardware sub-units. - In various embodiments, a random walker module (e.g.,
random walker module 302 of FIG. 3) receives a transaction graph, a set of seed nodes, and a parameter specifying a number of desired random walks for each of the seed nodes. Other parameters may also be received. The random walker module samples random walks for each seed node. In various embodiments, due to the graph being temporal in nature, graph traversal is limited to traveling backward in time (to past nodes), and traversal stops at the first illicit node found or when there are no more valid nodes to visit. In various embodiments, a feature extraction module (e.g., feature extraction module 304 of FIG. 3) receives the random walks for the set of seed nodes and computes aggregated features that summarize the random walks, e.g., the average number of steps needed to reach an illicit node, the total number of different illicit nodes found, and so forth. Various other metrics are typically also computed. The random walker module relies on labels in graph data 202 to determine illicit node stopping points. In some embodiments in which labels are missing, a labeling module (e.g., labeling module 306 of FIG. 3) estimates labels for nodes (e.g., using a machine learning or a rules-based approach). The random walker, feature extraction, and labeling modules are described in further detail below (e.g., see FIG. 3). - In various embodiments, the output of the feature extraction module is graph features 206. Graph features 206 can include various features associated with random walk size. Stated alternatively, graph features 206 can include various metrics based on traversal walks and related to distance between seed nodes and found illicit nodes. These metrics are described in further detail below (e.g., see
FIG. 3). Within framework 200, graph features 206 are utilized by machine learning model 208 in conjunction with input 210. Graph features 206 can be used to enrich the input data of input 210 to improve the performance of machine learning model 208. In various embodiments, machine learning model 208 is configured to predict whether a given transaction is associated with money laundering or other fraud. Within this context, data included in input 210 may be various properties associated with the given transaction to be analyzed for money laundering or other fraud. Examples of these properties include transaction amount, time, date, location, etc. Graph features 206 can enrich input 210, for example, by providing information regarding how an entity associated with the given transaction is connected to illicit actors via a node distance metric corresponding to a distance between the entity and the closest illicit actor. In various embodiments, graph features 206 and input 210 are encoded as numerical vectors. Non-numerical properties of input 210 can be converted to numerical representations. -
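A minimal sketch of this enrichment step is given below. The field names, the toy categorical encoding, and the particular graph features chosen are illustrative assumptions, not part of this disclosure:

```python
# Toy numeric encoding of a non-numerical (categorical) transaction property.
LOCATION_CODES = {"US": 0.0, "PT": 1.0}

def encode(transaction, graph_features):
    """Concatenate transaction properties and graph-extracted features
    into a single numerical vector for the model input."""
    return [
        transaction["amount"],
        LOCATION_CODES[transaction["location"]],
        graph_features["mean_dist"],  # mean distance to an illicit node
        graph_features["hit_rate"],   # fraction of successful walks
    ]

vec = encode({"amount": 250.0, "location": "PT"},
             {"mean_dist": 2.0, "hit_rate": 0.3})
```

The resulting vector would then be fed to the model alongside, or in place of, the raw transaction properties.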
Machine learning model 208 determines output 212 based on graph features 206 and input 210. For a given transaction, in various embodiments, output 212 includes a binary (yes or no) determination (e.g., 0 or 1 result) as to whether the given transaction is associated with money laundering or other fraud. Forms of fraud other than money laundering include account takeover, account-opening fraud, identity theft, and so forth. It is also possible for output 212 to take the form of a score (e.g., indicating a probability of money laundering or other fraud). Output 212 can also include an analysis of various features utilized by machine learning model 208 that contributed to a money laundering or other fraud prediction by machine learning model 208 (e.g., indicating which features contributed most to the prediction). Graph features 206 may also be utilized by a graph visualization tool that also displays output 212 to a human analyst. Visualization of graph features 206 can provide the human analyst with more information with which to review the given transaction or entity. Machine learning model 208 is a model that is built based on sample data (training data) in order to make its predictions or decisions without being explicitly programmed to do so. Various architectures are possible for machine learning model 208. In some embodiments, machine learning model 208 is a random forest model. It is also possible for machine learning model 208 to utilize a neural network or other types of architectures. -
FIG. 3 is a block diagram illustrating an embodiment of a system for performing graph analysis. In some embodiments, graph analysis component 300 is graph analysis component 204 of FIG. 2. In the example illustrated, graph analysis component 300 includes random walker module 302, feature extraction module 304, and labeling module 306. In some embodiments, graph analysis component 300 (including its sub-components) is comprised of computer program instructions that are executed on a general-purpose processor, e.g., a central processing unit (CPU), of a programmed computer system. FIG. 7 illustrates an example of a programmed computer system. It is also possible for the logic of graph analysis component 300 to be executed on other hardware, e.g., executed using an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). - In various embodiments,
random walker module 302 performs traversal walks on a received graph (e.g., graph 100 of FIG. 1). The nodes in the received graph can correspond to transactions. The nodes can also correspond to entities. In some embodiments, random walker module 302 receives as input an original transaction graph G, a list of seed nodes S ⊆ V(G), and a number of successful random walks desired, k. Successful random walks are described below. Random walker module 302's output for each seed node s ∈ S is a list of sampled random walks X^s = {X_1^s, X_2^s, . . . , X_k^s}. A random walk X_i^s is a sequence of nodes X_i^s = (x_1, x_2, . . . , x_n) (Equation 1), such that x_1 = s. In many scenarios, due to the temporal nature of the graph, random walker module 302 only traverses backward in time. Stated alternatively, it only traverses from node x_i to node x_j if x_j represents a transaction older than x_i (this will be referred to as the 'all backwards scenario'). For example, in a transaction network scenario, the network can be represented as a directed graph connecting older nodes to newer nodes by an edge directed from the older to the newer node. In such a transaction network, this means that (for a next node) only incoming neighbor nodes, with a timestamp lower than the timestamp of the current node, can be selected. In many scenarios, random walker module 302 can traverse to all nodes that are older than the seed node. Stated alternatively, it only traverses from node x_i to node x_j if x_j represents a transaction older than s (this will be referred to as the 'seed backwards scenario'). Node x_j could therefore be either in the past or in the future relative to node x_i in the seed backwards scenario. To avoid endless loops in this scenario, the same node cannot be chosen more than once in the same random walk. - In various embodiments, during a random walk, to select a next node after a current node,
random walker module 302 selects a node uniformly at random from the eligible neighbors of the current node. In the 'all backwards scenario' described above, random walker module 302 selects a node uniformly at random from the incoming neighbors of the current node. Because edges only connect older transactions to newer transactions, there is always an end node in any random walk, which means that random walker module 302 will not be stuck in an endless loop. In the 'seed backwards scenario' described above, random walker module 302 selects a node uniformly at random from all neighbors (incoming and outgoing) of the current node, provided that they are older than the seed node s and have not yet been selected in the current random walk, again avoiding endless loops. For a given state of a random walk X_i^s = (s = x_1, x_2, . . . , x_n), the random walk stops and returns X_i^s as the final random walk if a stopping criterion is met. In some embodiments, the random walk stops if at least one of the following criteria is met: 1) x_n is a known illicit node (corresponding to a known illicit transaction or entity) or 2) the set of eligible nodes to select from is empty (for example, in the 'all backwards scenario' described above, such a scenario occurs when a given node has no incoming neighbors and, consequently, random walker module 302 has no possible moves). In addition, in some embodiments, the random walk also stops if a maximum walk size (an adjustable parameter) has been reached. The maximum walk size parameter is useful to avoid being stuck in large walks and, thus, reduces computation time, particularly when performing many random walks for each seed. After a node is selected, if no stopping criterion is met, then random walker module 302 randomly picks a next node to add to X_i^s and the random walk continues.
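The 'all backwards scenario' selection and stopping logic described above can be sketched as follows; the graph shape, labels, and default maximum walk size are hypothetical:

```python
import random

# incoming[n] lists the older (incoming) neighbors of node n.
def random_walk(incoming, illicit, seed, max_size=20, rng=random):
    """Walk backward from `seed` until a stopping criterion is met."""
    walk = [seed]
    while len(walk) < max_size:              # stop: maximum walk size reached
        current = walk[-1]
        if current in illicit:               # stop: reached a known illicit node
            break
        neighbors = incoming.get(current, [])
        if not neighbors:                    # stop: no eligible nodes to select
            break
        walk.append(rng.choice(neighbors))   # uniform random next node
    return walk

# Toy transaction graph: t4 is newest; every backward path leads to t1.
incoming = {"t4": ["t2", "t3"], "t3": ["t1"], "t2": ["t1"], "t1": []}
walk = random_walk(incoming, illicit={"t1"}, seed="t4")
```

Because edges only point backward in time, every walk terminates; the `max_size` cap merely bounds the work per walk, as noted above.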
- The number of successful random walks parameter (e.g., supplied as a user input) sets a desired number of random walks ending in an illicit node from each seed node s ∈ S. Unsuccessful random walks are possible because, as discussed above,
random walker module 302 may encounter a node with no eligible neighbors and finish traversing a graph without reaching an illicit node. In various embodiments, to ensure the number of desired successful random walks is reached, random walker module 302 performs as many random walks as needed. As described below, a feature (metric) that can be extracted is the fraction of successful random walks from the total number of random walks needed to reach the number of successful random walks. This is also indicative of the number of unsuccessful random walks, which may be informative. In some scenarios, some nodes in a graph might have no paths to any illicit node. Thus, it is impossible to obtain successful random walks for those nodes. To avoid this problem, random walker module 302 can first determine which nodes can actually reach an illicit node. One way to do this is to invert the direction of the graph and use a descendant search algorithm for directed acyclic graphs to return all nodes reachable from a source node s in the graph G. Afterwards, the reachable nodes can be inspected to determine if at least one of them is illicit. This can be done for all nodes in a graph, and only those that can reach an illicit node are allowed to be seed nodes. - In various embodiments,
feature extraction module 304 extracts features (also referred to as metrics) based on random walks performed by random walker module 302. In various embodiments, feature extraction module 304 receives a list of random walks for each seed node and returns a data frame of features, corresponding to the transaction or entity associated with the seed node, summarizing the random walks. Examples of features that can be obtained are: 1) minimum size of the random walks (min); 2) maximum size of the random walks (max); 3) mean size of the random walks (mean); 4) standard deviation of the random walk sizes (std); 5) median size of the random walks (median); 6) first quartile of the random walk sizes (q25); 7) third quartile of the random walk sizes (q75); 8) fraction of successful random walks from all the random walks performed (hit rate, i.e., percentage of random walks that ended in an illicit node); and 9) number of distinct illicit nodes in the random walks (illicit), as well as the number of legitimate nodes. For example, from a specified seed node, if 100 random walks are performed and 30 reached an illicit node, the hit rate would be 30%. Among those 100 walks, the minimum walk size to reach an illicit node, the maximum walk size to reach an illicit node, and the corresponding mean, standard deviation, median, quartiles, and number of illicit nodes found would also be computable. It is also possible to report which nodes have no possible path to an illicit node. Features 1) through 7) can be considered distance metrics because they are based on measuring the number of steps to reach an illicit node from the seed node. - In some scenarios, nodes are labeled with scores indicating likelihoods of being illicit (instead of binary legitimate or illicit classifications). In these scenarios, these likelihood scores can be added up during random walks, and a cumulative score for each random walk can be recorded.
Minimum score, maximum score, mean score, standard deviation, median score, first quartile, and third quartile metrics can be computed based on likelihood scores. Nodes with scores above a specified threshold (indicating a specified likelihood of being illicit) can be used as stopping points for random walks, or a stopping criterion can be used where a stopping point occurs when the accumulated scores of all nodes in the current random walk exceed a specific threshold. This allows the hit rate and number of distinct illicit nodes metrics to be calculated in the same way as described above.
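The accumulated-score stopping criterion just described can be sketched as follows; the per-node likelihood scores, path, and threshold are illustrative values:

```python
def walk_until_score(path, scores, threshold):
    """Follow a walk, summing per-node likelihood scores, and stop at the
    node where the accumulated score first meets the threshold."""
    total = 0.0
    taken = []
    for node in path:
        taken.append(node)
        total += scores.get(node, 0.0)
        if total >= threshold:   # stopping point: accumulated score too high
            break
    return taken, total

# Toy likelihood scores for the nodes along one candidate walk.
scores = {"a": 0.2, "b": 0.5, "c": 0.6, "d": 0.1}
taken, total = walk_until_score(["a", "b", "c", "d"], scores, threshold=1.0)
```

With a stopping point defined this way, the hit rate and distinct-illicit-node counts can be computed exactly as for the binary-label case.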
- In some embodiments,
labeling module 306 supplies node labels to random walker module 302. This can occur when node labels are missing from a graph that random walker module 302 receives. In some embodiments, nodes of node type 106 in FIG. 1 are nodes missing labels. Because terminating a successful random walk depends on reaching a node known to be illicit, labels indicating which nodes are legitimate and illicit are needed. In many scenarios, labeling module 306 is not needed because human-generated labels are available for all nodes (e.g., when there is sufficient time for human analysts to review and label nodes). But there may be scenarios in which a node has been recently added for which a label is not yet available (e.g., a new entity is added and there has not been time to review the entity). In order to avoid a significant lag in analyzing a graph with such a recently added and unlabeled node, various approaches can be adopted. In some embodiments, labeling module 306 utilizes a machine learning model that is trained to predict labels for recent nodes that do not have labels. Such a machine learning model may be a neural network model that relies on various node properties (e.g., profession, location, etc. of an entity for an entity node, or transaction amount, location, etc. for a transaction node) to determine whether a node is likely to be legitimate or illicit. Machine learning model generated labels can be considered soft or pseudo-labels to be used until ground truth labels created by a human analyst are available. Another approach is to temporarily label new nodes as legitimate by default. It is also possible to utilize a rules-based approach (without using a machine learning model) to generate labels. For example, if a transaction amount is more than a specified threshold and/or an entity is from a specified list of countries, then the corresponding node may be labeled as suspicious and act as a stopping point in a random walk.
It is also possible to use human labelers to generate labels. -
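A hypothetical rules-based pseudo-labeler in the spirit of the example above might look as follows; the amount threshold and the placeholder country codes are assumptions for illustration:

```python
AMOUNT_THRESHOLD = 10_000            # illustrative amount threshold
WATCHLIST_COUNTRIES = {"XX", "YY"}   # placeholder country codes

def pseudo_label(node):
    """Rules-based temporary label for a node missing an analyst label."""
    if node.get("amount", 0) > AMOUNT_THRESHOLD:
        return "suspicious"
    if node.get("country") in WATCHLIST_COUNTRIES:
        return "suspicious"
    return "legitimate"  # default until a ground-truth label is available
```

Nodes labeled "suspicious" by such rules would then act as stopping points during random walks, just like known-illicit nodes.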
FIG. 4 is a flow diagram illustrating an embodiment of a process for determining metrics of a graph based on traversal walks of the graph. In some embodiments, at least a portion of the process of FIG. 4 is performed by graph analysis component 204 of FIG. 2 and/or graph analysis component 300 of FIG. 3. - At 402, a graph of nodes and edges is received. An example of a type of graph that may be received is
graph 100 of FIG. 1. In some embodiments, the graph represents a network of transactions. Stated alternatively, each node in such a graph represents a transaction, and edges connect nodes whose transactions share common entities conducting the transactions. In other embodiments, the graph represents a network of entities. Stated alternatively, each node in such a graph represents a specific entity (e.g., a person, business organization, etc.), and edges connect nodes whose entities share a common transaction. In some embodiments, the graph is received by random walker module 302 of FIG. 3. - At 404, an identification of a starting node in the graph is received. The starting node is also referred to as a seed node. In some embodiments, the identification of the starting node is provided by a user. The starting node can be a node for which the user desires a prediction as to whether it is legitimate or illicit. In various embodiments, the starting node is one of a plurality of starting nodes provided. Stated alternatively, a set of seed nodes may be provided, from which one seed node at a time is utilized. In some embodiments, the identification of the starting node is received by
random walker module 302 of FIG. 3. - At 406, traversal walks on the graph from the starting node are automatically performed. Beginning from the starting node, subsequent nodes are selected automatically and randomly without user input. In various embodiments, performing each of the traversal walks includes traversing to a randomly selected next node until any of one or more stopping criteria is met. Examples of stopping criteria include: 1) reaching a node labeled as illicit; 2) not being able to traverse to any more nodes; and 3) reaching a maximum walk size. In various embodiments, for each successful walk (a random walk that ends in reaching an illicit node), the path of the random walk in the form of a list of nodes traversed is generated. In some embodiments, the traversal walks are performed by
random walker module 302 of FIG. 3. In various embodiments, the lists of nodes traversed for the traversal walks are transmitted to feature extraction module 304 for analysis. - At 408, one or more metrics are determined based on the traversal walks. In some embodiments, each traversal walk is encoded as a sequence of nodes. The length of the sequence of nodes (the number of nodes in the sequence) corresponds to a distance between the starting node and an ending node, which can be considered a size of a random walk. In various embodiments, successful walks (random walks that reach an illicit node) are compiled into a distribution of successful walks to compute various metrics associated with traversal walk size, such as maximum size, minimum size, average size, size standard deviation, size quartiles, and median size. Successful walks can also be analyzed to determine how many distinct illicit nodes were found (unsuccessful walks would not be needed because they would not include illicit nodes). In various embodiments, unsuccessful walks (random walks that do not reach an illicit node) are used to compute metrics such as the fraction of successful walks (out of a total number of walks), the ratio of successful to unsuccessful walks, etc. In some embodiments, the one or more metrics are determined by
feature extraction module 304 of FIG. 3. - At 410, at least a portion of the one or more metrics is used to predict an illicit activity or entity. In some embodiments, the one or more metrics are transmitted to a machine learning model (e.g.,
machine learning model 208 of FIG. 2) that predicts the illicit activity or entity based on at least some of the one or more metrics and additional inputs to the machine learning model. For example, the machine learning model may predict whether a specific activity (e.g., a specific purchase transaction between a buyer and seller) is illicit. The specific purchase transaction may be associated with a seller corresponding to the starting node in the graph. The one or more metrics can include distance metrics (e.g., mean distance from the starting node to an illicit node corresponding to an illicit actor) that can aid the machine learning model in determining whether the specific purchase transaction is illicit. The machine learning model may also be tasked with directly determining whether a specific entity is illicit based on distance metrics corresponding to how close the specific entity is to illicit actors (in terms of node distances in a graph in which the specific entity corresponds to the starting node). -
FIG. 5 is a flow diagram illustrating an embodiment of a process for handling unlabeled nodes in a graph. In some embodiments, the process of FIG. 5 is performed by graph analysis component 300 of FIG. 3. In some embodiments, at least a portion of the process of FIG. 5 is performed in 402 of FIG. 4. - At 502, a node is received. In some embodiments, the node is one of multiple nodes received by
random walker module 302 of FIG. 3. - At 504, it is determined whether the node is missing a label. If at 504 it is determined that the node is not labeled, at 506, a label for the node is generated using a machine learning model (e.g., a neural network model as discussed with respect to
labeling module 306 of FIG. 3). It is also possible to use labeling approaches that do not rely on a machine learning model (e.g., rules-based approaches). If at 504 it is determined that the node is labeled, 506 is not performed. - At 508, it is determined whether there are more nodes to analyze. If at 508 it is determined that there are more nodes to analyze, at 502, another node is received in order to determine whether that node is labeled. If at 508 it is determined that there are no more nodes to analyze, then no further action is taken.
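The label-checking loop of FIG. 5 (steps 502 through 508) can be sketched as follows; the node fields and the predictor function are hypothetical:

```python
def ensure_labels(nodes, predict):
    """Give every node missing a label a pseudo-label from `predict`,
    which may be a trained model or a rules-based function."""
    for node in nodes:                     # 502/508: receive each node in turn
        if node.get("label") is None:      # 504: is the node missing a label?
            node["label"] = predict(node)  # 506: generate a label for it
    return nodes

nodes = [{"id": "t1", "label": "illicit"},
         {"id": "t2", "label": None}]      # recently added, not yet reviewed
ensure_labels(nodes, predict=lambda n: "legitimate")
```

Existing analyst labels are left untouched; only unlabeled nodes receive a temporary label, which can later be replaced by ground truth.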
-
FIG. 6 is a flow diagram illustrating an embodiment of a process for automatically performing a traversal walk on a graph from a starting node. In various embodiments, the traversal walk performed is one of a plurality of traversal walks performed in order to generate a distribution of traversal walks. In some embodiments, the process of FIG. 6 is performed by random walker module 302 of FIG. 3. In some embodiments, at least a portion of the process of FIG. 6 is performed in 406 of FIG. 4. - At 602, a node is arrived at and a list of nodes is updated. The node may be a starting node of a sequence of nodes, an ending node in the sequence of nodes, or an intermediate node in a path of nodes from the starting node to the ending node. The list of nodes is a running sequence of nodes beginning with the starting node. The list of nodes is updated to include the node that is arrived at, and the list of nodes is continually updated until after the ending node is reached. The ending node is reached if at least one stopping criterion is met.
- At 604, it is determined whether at least one stopping criterion is met. An example of a stopping criterion is the node that is arrived at being an illicit node. Another example of a stopping criterion is the node that is arrived at being a terminal node (that is not an illicit node) that has no eligible neighboring nodes. If at 604 it is determined that no stopping criterion has been met, at 606, a neighboring node to traverse to is randomly selected. In various embodiments, the random selection is uniform in the sense that each eligible neighboring node has an equal probability of being selected. The neighboring node that is randomly selected then becomes the next arrival node at 602. The list of nodes is updated to reflect the addition of the randomly selected neighboring node to the sequence of nodes from the starting node to the ending node.
- If at 604 it is determined that a stopping criterion has been met, at 608, the list of nodes is provided. Meeting the stopping criterion causes the traversal walk to terminate. In some embodiments, the list of nodes, which describes the sequence of nodes visited in the traversal walk, is transmitted to feature
extraction module 304 of FIG. 3. -
FIG. 7 is a functional diagram illustrating a programmed computer system. In some embodiments, processes associated with framework 200 of FIG. 2 and/or graph analysis component 300 of FIG. 3 are executed by computer system 700. In some embodiments, the processes of FIGS. 4, 5, and/or 6 are executed by computer system 700. - In the example shown,
computer system 700 includes various subsystems as described below. Computer system 700 includes at least one microprocessor subsystem (also referred to as a processor or a central processing unit (CPU)) 702. Computer system 700 can be physical or virtual (e.g., a virtual machine). For example, processor 702 can be implemented by a single-chip processor or by multiple processors. In some embodiments, processor 702 is a general-purpose digital processor that controls the operation of computer system 700. Using instructions retrieved from memory 710, processor 702 controls the reception and manipulation of input data, and the output and display of data on output devices (e.g., display 718). -
Processor 702 is coupled bi-directionally with memory 710, which can include a first primary storage, typically a random-access memory (RAM), and a second primary storage area, typically a read-only memory (ROM). As is well known in the art, primary storage can be used as a general storage area and as scratch-pad memory, and can also be used to store input data and processed data. Primary storage can also store programming instructions and data, in the form of data objects and text objects, in addition to other data and instructions for processes operating on processor 702. Also, as is well known in the art, primary storage typically includes basic operating instructions, program code, data, and objects used by processor 702 to perform its functions (e.g., programmed instructions). For example, memory 710 can include any suitable computer-readable storage media, described below, depending on whether, for example, data access needs to be bi-directional or uni-directional. For example, processor 702 can also directly and very rapidly retrieve and store frequently needed data in a cache memory (not shown). - Persistent memory 712 (e.g., a removable mass storage device) provides additional data storage capacity for
computer system 700, and is coupled either bi-directionally (read/write) or uni-directionally (read only) to processor 702. For example, persistent memory 712 can also include computer-readable media such as magnetic tape, flash memory, PC-CARDS, portable mass storage devices, holographic storage devices, and other storage devices. A fixed mass storage 720 can also, for example, provide additional data storage capacity. The most common example of fixed mass storage 720 is a hard disk drive. Persistent memory 712 and fixed mass storage 720 generally store additional programming instructions, data, and the like that typically are not in active use by processor 702. It will be appreciated that the information retained within persistent memory 712 and fixed mass storage 720 can be incorporated, if needed, in standard fashion as part of memory 710 (e.g., RAM) as virtual memory. - In addition to providing
processor 702 access to storage subsystems, bus 714 can also be used to provide access to other subsystems and devices. As shown, these can include a display monitor 718, a network interface 716, a keyboard 704, and a pointing device 706, as well as an auxiliary input/output device interface, a sound card, speakers, and other subsystems as needed. For example, pointing device 706 can be a mouse, stylus, track ball, or tablet, and is useful for interacting with a graphical user interface. -
Network interface 716 allows processor 702 to be coupled to another computer, computer network, or telecommunications network using a network connection as shown. For example, through network interface 716, processor 702 can receive information (e.g., data objects or program instructions) from another network or output information to another network in the course of performing method/process steps. Information, often represented as a sequence of instructions to be executed on a processor, can be received from and outputted to another network. An interface card or similar device and appropriate software implemented by (e.g., executed/performed on) processor 702 can be used to connect computer system 700 to an external network and transfer data according to standard protocols. Processes can be executed on processor 702, or can be performed across a network such as the Internet, intranet networks, or local area networks, in conjunction with a remote processor that shares a portion of the processing. Additional mass storage devices (not shown) can also be connected to processor 702 through network interface 716. - An auxiliary I/O device interface (not shown) can be used in conjunction with
computer system 700. The auxiliary I/O device interface can include general and customized interfaces that allow processor 702 to send and, more typically, receive data from other devices such as microphones, touch-sensitive displays, transducer card readers, tape readers, voice or handwriting recognizers, biometrics readers, cameras, portable mass storage devices, and other computers. - In addition, various embodiments disclosed herein further relate to computer storage products with a computer-readable medium that includes program code for performing various computer-implemented operations. The computer-readable medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of computer-readable media include, but are not limited to, all the media mentioned above: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks; and specially configured hardware devices such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs), and ROM and RAM devices. Examples of program code include both machine code, as produced, for example, by a compiler, and files containing higher-level code (e.g., script) that can be executed using an interpreter.
- The computer system shown in
FIG. 7 is but an example of a computer system suitable for use with the various embodiments disclosed herein. Other computer systems suitable for such use can include additional or fewer subsystems. In addition, bus 714 is illustrative of any interconnection scheme serving to link the subsystems. Other computer architectures having different configurations of subsystems can also be utilized. - Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/553,265 US20220198471A1 (en) | 2020-12-18 | 2021-12-16 | Graph traversal for measurement of fraudulent nodes |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063127941P | 2020-12-18 | 2020-12-18 | |
PT117636 | 2021-12-14 | ||
PT11763621 | 2021-12-14 | ||
US17/553,265 US20220198471A1 (en) | 2020-12-18 | 2021-12-16 | Graph traversal for measurement of fraudulent nodes |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220198471A1 true US20220198471A1 (en) | 2022-06-23 |
Family
ID=81608086
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/553,265 Pending US20220198471A1 (en) | 2020-12-18 | 2021-12-16 | Graph traversal for measurement of fraudulent nodes |
Country Status (2)
Country | Link |
---|---|
US (1) | US20220198471A1 (en) |
EP (1) | EP4016430A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116467468B (en) * | 2023-05-05 | 2024-01-05 | 国网浙江省电力有限公司 | Power management system abnormal information handling method based on knowledge graph technology |
Citations (59)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120059858A1 (en) * | 2010-09-03 | 2012-03-08 | Jackson Jr Robert Lewis | Minimal representation of connecting walks |
US20140143110A1 (en) * | 2012-11-20 | 2014-05-22 | Sap Ag | Circular Transaction Path Detection |
US20150134512A1 (en) * | 2013-11-13 | 2015-05-14 | Mastercard International Incorporated | System and method for detecting fraudulent network events |
US20150161622A1 (en) * | 2013-12-10 | 2015-06-11 | Florian Hoffmann | Fraud detection using network analysis |
US20150293994A1 (en) * | 2012-11-06 | 2015-10-15 | Hewlett-Packard Development Company, L.P. | Enhanced graph traversal |
US20150294028A1 (en) * | 2012-11-13 | 2015-10-15 | American Express Travel Related Services Company, Inc. | Systems and Methods for Dynamic Construction of Entity Graphs |
US20160171732A1 (en) * | 2014-10-14 | 2016-06-16 | International Business Machines Corporation | Visualization of relationships and strengths between data nodes |
US20170053294A1 (en) * | 2015-08-18 | 2017-02-23 | Mastercard International Incorporated | Systems and methods for generating relationships via a property graph model |
US20170140382A1 (en) * | 2015-11-12 | 2017-05-18 | International Business Machines Corporation | Identifying transactional fraud utilizing transaction payment relationship graph link prediction |
US20170169174A1 (en) * | 2015-12-10 | 2017-06-15 | Ayasdi, Inc. | Detection of fraud or abuse |
US20170178139A1 (en) * | 2015-12-18 | 2017-06-22 | Aci Worldwide Corp. | Analysis of Transaction Information Using Graphs |
US20170178136A1 (en) * | 2015-12-16 | 2017-06-22 | Mastercard International Incorporated | Systems and methods for identifying suspect illicit merchants |
US20180196694A1 (en) * | 2017-01-11 | 2018-07-12 | The Western Union Company | Transaction analyzer using graph-oriented data structures |
US20180330258A1 (en) * | 2017-05-09 | 2018-11-15 | Theodore D. Harris | Autonomous learning platform for novel feature discovery |
US20190164173A1 (en) * | 2017-11-28 | 2019-05-30 | Equifax Inc. | Synthetic online entity detection |
US20190266528A1 (en) * | 2018-02-25 | 2019-08-29 | Graphen, Inc. | System for Discovering Hidden Correlation Relationships for Risk Analysis Using Graph-Based Machine Learning |
US20190378050A1 (en) * | 2018-06-12 | 2019-12-12 | Bank Of America Corporation | Machine learning system to identify and optimize features based on historical data, known patterns, or emerging patterns |
US20190377819A1 (en) * | 2018-06-12 | 2019-12-12 | Bank Of America Corporation | Machine learning system to detect, label, and spread heat in a graph structure |
US10515366B1 (en) * | 2013-12-24 | 2019-12-24 | EMC IP Holding Company LLC | Network neighborhood topology as a predictor for fraud and anomaly detection |
US20200036821A1 (en) * | 2018-07-27 | 2020-01-30 | BEIJING XIAOMI MOBILE SOFTWARE CO., LTD., Beijing, CHINA | Front case of electronic device, and electronic device |
US20200110761A1 (en) * | 2018-10-04 | 2020-04-09 | Graphen, Inc. | System and method for providing an artificially-intelligent graph database |
US20200142957A1 (en) * | 2018-11-02 | 2020-05-07 | Oracle International Corporation | Learning property graph representations edge-by-edge |
US20200160121A1 (en) * | 2018-11-16 | 2020-05-21 | Refinitiv Us Organization Llc | Systems and method for scoring entities and networks in a knowledge graph |
US20200210833A1 (en) * | 2018-12-28 | 2020-07-02 | Visa International Service Association | Method, System, and Computer Program Product for Determining Relationships of Entities Associated with Interactions |
US20200226512A1 (en) * | 2017-09-22 | 2020-07-16 | 1Nteger, Llc | Systems and methods for investigating and evaluating financial crime and sanctions-related risks |
US10740399B1 (en) * | 2017-11-10 | 2020-08-11 | Pinterest, Inc. | Node graph traversal methods |
US10778706B1 (en) * | 2020-01-10 | 2020-09-15 | Capital One Services, Llc | Fraud detection using graph databases |
US20200320534A1 (en) * | 2019-04-04 | 2020-10-08 | Paypal, Inc. | Systems and methods for using machine learning to predict events associated with transactions |
US20200349586A1 (en) * | 2019-04-30 | 2020-11-05 | Paypal, Inc. | Detecting fraud using machine-learning |
US20200364366A1 (en) * | 2019-05-15 | 2020-11-19 | International Business Machines Corporation | Deep learning-based identity fraud detection |
US20200380376A1 (en) * | 2019-05-28 | 2020-12-03 | Accenture Global Solutions Limited | Artificial Intelligence Based System And Method For Predicting And Preventing Illicit Behavior |
US20210012346A1 (en) * | 2019-07-10 | 2021-01-14 | Capital One Services, Llc | Relation-based systems and methods for fraud detection and evaluation |
US20210014124A1 (en) * | 2019-07-10 | 2021-01-14 | Adobe Inc. | Feature-based network embedding |
US20210019762A1 (en) * | 2019-07-19 | 2021-01-21 | Intuit Inc. | Identity resolution for fraud ring detection |
US20210027145A1 (en) * | 2018-03-22 | 2021-01-28 | China Unionpay Co., Ltd. | Fraudulent transaction detection method based on sequence wide and deep learning |
US20210049171A1 (en) * | 2019-08-16 | 2021-02-18 | Oracle International Corporation | Efficient sql-based graph random walk |
US20210049225A1 (en) * | 2019-08-15 | 2021-02-18 | Advanced New Technologies Co., Ltd. | Method and apparatus for processing user interaction sequence data |
US20210067549A1 (en) * | 2019-08-29 | 2021-03-04 | Nec Laboratories America, Inc. | Anomaly detection with graph adversarial training in computer systems |
US20210065245A1 (en) * | 2019-08-30 | 2021-03-04 | Intuit Inc. | Using machine learning to discern relationships between individuals from digital transactional data |
US20210158161A1 (en) * | 2019-11-22 | 2021-05-27 | Fraud.net, Inc. | Methods and Systems for Detecting Spurious Data Patterns |
US20210176262A1 (en) * | 2018-05-02 | 2021-06-10 | Visa International Service Association | Event monitoring and response system and method |
US20210209604A1 (en) * | 2020-01-06 | 2021-07-08 | Visa International Service Association | Method, System, and Computer Program Product for Detecting Group Activities in a Network |
US20210233080A1 (en) * | 2020-01-24 | 2021-07-29 | Adobe Inc. | Utilizing a time-dependent graph convolutional neural network for fraudulent transaction identification |
US20210303783A1 (en) * | 2020-03-31 | 2021-09-30 | Capital One Services, Llc | Multi-layer graph-based categorization |
US20210311952A1 (en) * | 2020-04-02 | 2021-10-07 | Capital One Services, Llc | Computer-based systems for dynamic network graph generation based on automated entity and/or activity resolution and methods of use thereof |
US20210334822A1 (en) * | 2020-04-22 | 2021-10-28 | Actimize Ltd. | Systems and methods for detecting unauthorized or suspicious financial activity |
US20210334811A1 (en) * | 2018-12-21 | 2021-10-28 | Paypal, Inc. | System and Method for Fraudulent Scheme Detection using Time-Evolving Graphs |
US20210374754A1 (en) * | 2020-05-28 | 2021-12-02 | Paypal, Inc. | Risk assessment through device data using machine learning-based network |
US20220020026A1 (en) * | 2020-07-17 | 2022-01-20 | Mastercard International Incorporated | Anti-money laundering methods and systems for predicting suspicious transactions using artifical intelligence |
US20220101327A1 (en) * | 2020-09-29 | 2022-03-31 | Mastercard International Incorporated | Method and system for detecting fraudulent transactions |
US20220121891A1 (en) * | 2020-10-19 | 2022-04-21 | Fujitsu Limited | Labeling and data augmentation for graph data |
US20220129871A1 (en) * | 2016-04-20 | 2022-04-28 | Wells Fargo Bank, N.A. | System for mapping user trust relationships |
US20220172211A1 (en) * | 2020-11-30 | 2022-06-02 | International Business Machines Corporation | Applying machine learning to learn relationship weightage in risk networks |
US20220188837A1 (en) * | 2020-12-10 | 2022-06-16 | Jpmorgan Chase Bank, N.A. | Systems and methods for multi-agent based fraud detection |
US20220247662A1 (en) * | 2021-01-29 | 2022-08-04 | Paypal, Inc. | Graph-Based Node Classification Based on Connectivity and Topology |
US20220300903A1 (en) * | 2021-03-19 | 2022-09-22 | The Toronto-Dominion Bank | System and method for dynamically predicting fraud using machine learning |
US11593622B1 (en) * | 2020-02-14 | 2023-02-28 | Amazon Technologies, Inc. | Artificial intelligence system employing graph convolutional networks for analyzing multi-entity-type multi-relational data |
US11640609B1 (en) * | 2019-12-13 | 2023-05-02 | Wells Fargo Bank, N.A. | Network based features for financial crime detection |
US11704673B1 (en) * | 2020-06-29 | 2023-07-18 | Stripe, Inc. | Systems and methods for identity graph based fraud detection |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6285999B1 (en) * | 1997-01-10 | 2001-09-04 | The Board Of Trustees Of The Leland Stanford Junior University | Method for node ranking in a linked database |
2021
- 2021-12-16 US US17/553,265 patent/US20220198471A1/en active Pending
- 2021-12-20 EP EP21215898.4A patent/EP4016430A1/en active Pending
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200394707A1 (en) * | 2018-02-28 | 2020-12-17 | Alibaba Group Holding Limited | Method and system for identifying online money-laundering customer groups |
US20220207536A1 (en) * | 2020-12-29 | 2022-06-30 | Visa International Service Association | System, Method, and Computer Program Product for Generating Synthetic Data |
US11640610B2 (en) * | 2020-12-29 | 2023-05-02 | Visa International Service Association | System, method, and computer program product for generating synthetic data |
US20220398264A1 (en) * | 2021-06-10 | 2022-12-15 | Jpmorgan Chase Bank, N.A. | Systems and methods for streaming classification of distributed ledger-based activities |
US11853278B2 (en) | 2021-06-10 | 2023-12-26 | Jpmorgan Chase Bank , N.A. | Systems and methods for combining graph embedding and random forest classification for improving classification of distributed ledger activities |
US12050573B2 (en) * | 2021-06-10 | 2024-07-30 | Jpmorgan Chase Bank , N.A. | Systems and methods for streaming classification of distributed ledger-based activities |
US20240078260A1 (en) * | 2022-09-02 | 2024-03-07 | Tsinghua University | Systems and methods for general-purpose out-of-core random walk graph computing |
US12013897B2 (en) * | 2022-09-02 | 2024-06-18 | Tsinghua University | Systems and methods for general-purpose out-of-core random walk graph computing |
Also Published As
Publication number | Publication date |
---|---|
EP4016430A1 (en) | 2022-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220198471A1 (en) | Graph traversal for measurement of fraudulent nodes | |
US11729194B2 (en) | Automatic model monitoring for data streams | |
US20240303468A1 (en) | Interleaved sequence recurrent neural networks for fraud detection | |
US12001800B2 (en) | Semantic-aware feature engineering | |
US8355896B2 (en) | Co-occurrence consistency analysis method and apparatus for finding predictive variable groups | |
US11250444B2 (en) | Identifying and labeling fraudulent store return activities | |
US20190236608A1 (en) | Transaction Aggregation and Multi-attribute Scoring System | |
Savage et al. | Detection of money laundering groups: Supervised learning on small networks | |
US11620653B2 (en) | Systems and methods for configuring and implementing a malicious account testing machine learning model in a machine learning-based digital threat mitigation platform | |
Iyer et al. | Credit card fraud detection using hidden markov model | |
KR102259838B1 (en) | Apparatus and method for building a blacklist of cryptocurrencies | |
Jonnalagadda et al. | Credit card fraud detection using Random Forest Algorithm | |
EP4060563A1 (en) | Automatic profile extraction in data streams using recurrent neural networks | |
CN115545886A (en) | Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium | |
Cochrane et al. | Pattern analysis for transaction fraud detection | |
US11916927B2 (en) | Systems and methods for accelerating a disposition of digital dispute events in a machine learning-based digital threat mitigation platform | |
Marmo | Data mining for fraud detection | |
CN111652718A (en) | Method, device, equipment and medium for monitoring value flow direction based on relational network diagram | |
US12143402B2 (en) | Systems and methods for accelerating a disposition of digital dispute events in a machine learning-based digital threat mitigation platform | |
US20200151596A1 (en) | Entity recognition system based on interaction vectorization | |
Farhan | Credit Fraud Recognition Based on Performance Evaluation of Deep Learning Algorithms | |
US20220201035A1 (en) | Apparatus, method and computer program product for identifying a set of messages of interest in a network | |
BETT KIPKIRUI | Enhancement of Credit Score Prediction for imbalanced Datasets Using Data Mining Approaches (Case study: Bank of Kigali) | |
CN117764589A (en) | Risk prevention and control method, device, equipment and medium | |
CN118505230A (en) | Training method and device for detection model, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
AS | Assignment |
Owner name: FEEDZAI - CONSULTADORIA E INOVACAO TECNOLOGICA, S.A., PORTUGAL Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SILVA, MARIA INES;APARICIO, DAVID OLIVEIRA;EDDIN, AHMAD NASER;AND OTHERS;SIGNING DATES FROM 20220225 TO 20220301;REEL/FRAME:059577/0496 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |