US20190140835A1 - Blind Hash Compression - Google Patents
- Publication number
- US20190140835A1 (application US16/236,566)
- Authority
- US
- United States
- Prior art keywords
- data
- computing devices
- code
- server system
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3236—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0876—Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/06—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
- H04L9/0643—Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/16—Obfuscation or hiding, e.g. involving white box
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/30—Compression, e.g. Merkle-Damgard construction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3236—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
- H04L9/3239—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
Definitions
- This document generally relates to computer communications.
- Web content such as HTML or JavaScript for generating web pages
- Web content may contain application-like functionality that is interpreted and executed within a visitor's browser, or in a similar application.
- the general goal with HTML and other web technologies is to make them work, and work similarly, across many different platforms (e.g., Mac, PC, Linux, etc.).
- This document describes systems and techniques by which various user computing devices (computers such as desktops, laptops, tablets, and smartphones) can submit information to a server system in a manner that lowers the bandwidth required for such reporting.
- certain of the computing devices can send information in a lossy compressed format (e.g., as a hash of the original information), while others can send the same information in an uncompressed format (e.g., as the original plaintext).
- the compressed format may be highly compressed, such as by a lossy one-way function so that the server system cannot immediately determine what original string a compressed submission is indicative of (e.g., via a hash function or other lossy compression function).
- the server system compresses any received uncompressed submissions (or submissions made with lossless compression) using the same technique used by the client devices to perform their compression, at which point the server system knows the correlation between the uncompressed and compressed representations, and can then correlate any previously- or later-received compressed representations back to the original raw data.
- the percentage of the client computers reporting raw data may be much smaller than those reporting compressed data, so that the overall bandwidth of the system is substantially reduced.
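- The server-side correlation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: SHA-256 is an assumed stand-in for the lossy one-way function, and the function names and sample strings are hypothetical.

```python
import hashlib


def compress(text: str) -> str:
    """Lossy one-way compression; SHA-256 is an assumed stand-in."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def correlate(plaintext_reports, hashed_reports):
    """Map each hashed report to its plaintext, or None if still unknown."""
    # Hash every plaintext submission with the same function the clients use.
    lookup = {compress(p): p for p in plaintext_reports}
    return {h: lookup.get(h) for h in hashed_reports}


# Hypothetical sample data: one plaintext report, two hashed reports.
plain = ["Plugin: ExampleToolbar 1.2"]
hashed = [compress("Plugin: ExampleToolbar 1.2"), compress("Plugin: Unknown 9.9")]
resolved = correlate(plain, hashed)
```

- In this sketch the second hashed report stays unresolved (mapped to None) until some device submits the matching plaintext.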
- each of the computing devices may determine whether it should submit a compressed representation of the data, or instead, an uncompressed representation by generating a random number (again, e.g., using standard JavaScript functions), and only send a particular format or representation if the generated number is above or below a predetermined number, as the case may be.
- the server system may provide a biasing value to the computing devices when it serves web code so as to push the random number higher or lower, so as to affect the likelihood that any particular computing device will send uncompressed, raw data instead of compressed data. More frequent submission of uncompressed representations will allow a server system to more quickly identify the real meaning of data that newly arrives, e.g., when new features arrive on the computing devices (e.g., new plug ins are announced), but could cause higher bandwidth usage in a pool of computing devices. Thus, an operator of a server system may use the biasing value to match its desire for fast reaction versus its desire for lower bandwidth requirements.
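- A client-side selection of this kind might look like the sketch below. It assumes the biasing value acts as a threshold on a random integer drawn from a fixed range, and that SHA-256 is the compression function; the range size, the function name, and the report layout are all hypothetical.

```python
import hashlib
import random

RANGE = 1024  # assumed size of the random range: draws fall in 0..1023


def build_report(value: str, bias: int, rng: random.Random) -> dict:
    """Return a plaintext report when the draw exceeds the bias, else a hash."""
    draw = rng.randrange(RANGE)
    if draw > bias:
        return {"format": "plaintext", "payload": value}
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return {"format": "hash", "payload": digest}
```

- Under these assumptions, raising the bias makes plaintext submissions rarer (lower bandwidth, slower discovery of new values), and lowering it makes them more frequent.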
- the compression algorithm may be one that is available from public libraries, such as standard JavaScript hash algorithms.
- the server system may automatically obtain plaintext representations of new data as it arrives in a pool of computers (e.g., all computers trying to access a particular retailer's web site), but may also determine how broadly such information has spread without having to send the potentially voluminous plaintext representation for very many of the computing devices.
- hashing algorithms are selective enough that very few collisions will be seen between hashes (i.e., two different strings of text sent by computing devices will seldom generate the same hash value).
- a server system will not be able to determine what is meant by such a compressed value when it arrives (it will be ambiguous as between the two or more source strings that generate the compressed value).
- the system just discussed may also include provisions for resolving such collisions.
- a computing device may perform a secondary compression that uses a different algorithm than the primary compression, so that if the values of both compressions do not match across different submissions, then the source text for those different submissions is known to be different.
- a length of the source string may also be submitted to serve as yet another separate check on the source string.
- the collected data may be configuration data for the computing devices, which may include, for example, the make and model of the computer, the make and version of the operating system and the web browser that is being used, the identity of active plug ins and other applications currently executing on the computing device in addition to the browser, among other things, such as installed fonts, screen resolution, etc.
- Collected data may also include activity data that identifies actions that have been taken on the computer, including actions by third-party software that appear to be anomalous (e.g., attempts to interact with the revised web code in an invalid manner).
- Such data may be collected by one or more central server systems for diagnostics purposes, including for identifying the state of machines when a program throws an error, and for identifying common characteristics of computing devices that are exhibiting fraudulent or other anomalous behavior.
- a criminal group may have a plug in or other software surreptitiously distributed to thousands of computers spread across the world to form a so-called bot net, and the server system discussed here may use reporting information from such computers to more quickly and accurately identify the presence of a new bot net that is emerging, and the behavior of that bot net (e.g., if common reports of malicious activity are coming from a particular operating system running a particular browser version).
- a computer-implemented method can include serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting parameters of the computing devices; receiving from different ones of the computing devices, a plaintext representation of a particular parameter of a first of the computing devices, and a hashed representation of the same parameter of a second of the computing devices; hashing the plaintext representation of the particular parameter to create a hash value, and comparing the hash value to the hashed representation; and based on a determination that the hash value matches the hashed representation, correlating the hashed representation to the plaintext representation on the computer server system, wherein the code for reporting parameters of the computing devices includes code for allowing the computing devices to determine whether to send a plaintext representation or a hashed representation.
- the code for allowing the computing devices to determine whether to send a plaintext representation or a hashed representation can include biasing data that affects a frequency with which the computing devices select to send the plaintext representation or the hashed representation.
- the method can further include receiving from the computing devices, plaintext representations and hashed representations of a plurality of different parameters of the computing devices; hashing the received plaintext representations to create hashed values; and using correlations between the hashed values and the received plaintext representations to identify parameters represented by the hashed representations.
- the method can further include using the hashed representation and the plaintext representation to identify characteristics of malware executing on the computing devices.
- a computer-implemented method can include serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting status of the computing devices; receiving from one or more of the computing devices, first data that indicates a parameter of the one or more computing devices, the first data in a compressed format; receiving from one or more others of the computing devices, second data that indicates the parameter of the one or more others of the computing devices, the second data in an uncompressed format; and compressing the second data and comparing the compressed second data to the first data to correlate the first data to the second data, wherein the code for reporting status of the computing devices includes code for allowing the computing devices to determine whether to send the first data or the second data.
- the code for allowing the computing devices to determine whether to send the first data or the second data can include biasing data that affects a frequency with which the computing devices select to send the first data or the second data.
- the first data can be compressed on the computing devices using hashing.
- the server system can be configured to not send hashing algorithm information to the computing devices.
- the method can further include using the compressed format to represent the parameter in identifying aggregate activity by multiple of the computing devices.
- the method can further include determining from the aggregate activity by multiple of the computing devices whether ones of the multiple computing devices are infected with malware.
- the computer server system can be an intermediary security server system that is separate from a web server system that generates and serves the web code.
- the method can further include comparing information sent with the compressed second data to information derived from the received first data to determine whether the compressed second data was generated from data that matches the first data.
- one or more non-transitory storage devices can store instructions that, when executed by one or more computer processors, perform operations comprising: serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting status of the computing devices; receiving from one or more of the computing devices, first data that indicates a parameter of the one or more computing devices, the first data in a compressed format; receiving from one or more others of the computing devices, second data that indicates the parameter of the one or more others of the computing devices, the second data in an uncompressed format; and compressing the second data and comparing the compressed second data to the first data to correlate the first data to the second data, wherein the code for reporting status of the computing devices includes code for allowing the computing devices to determine whether to send the first data or the second data.
- the code for allowing the computing devices to determine whether to send the first data or the second data can include biasing data that affects a frequency with which the computing devices select to send the first data or the second data.
- the first data can be compressed on the computing devices using hashing.
- the operations can further include using the compressed format to represent the parameter in identifying aggregate activity by multiple of the computing devices.
- the operations can further include determining from the aggregate activity by multiple of the computing devices whether ones of the multiple computing devices are infected with malware.
- the computer server system can include an intermediary security server system that is separate from a web server system that generates and serves the web code.
- the operations can further include comparing information sent with the compressed second data to information derived from the received first data to determine whether the compressed second data was generated from data that matches the first data.
- a computer-implemented system includes: a first data communication interface arranged to communicate with a web server system; a second data communication interface arranged to communicate with clients that request content from the web server system; a compressed code interpreter programmed to identify an original form of compressed content received from particular ones of the clients by (a) compressing original content received from other ones of the clients to form a compressed representation, and (b) comparing the compressed representation to the compressed content received from the particular ones of the clients, wherein the compressed code interpreter compresses the original content using a technique that matches techniques used by the particular ones of the clients to compress the content.
- the system can be further programmed to provide code to the clients that allows the clients to determine whether to provide compressed content or instead, uncompressed content to the system.
- a security intermediary system may be provided that does not add an appreciable level of bandwidth to the communication channel between a server system and the clients it services.
- the intermediary system may collect data that is relatively large compared to the bandwidth that it occupies, and may use that data for diagnosing problems with particular clients, and across large numbers of clients (e.g., by identifying the spread of malware threats).
- a wide variety of data may be transmitted using these techniques, and may be used for a wide variety of purposes once it is interpreted at the server system.
- the compressed representations can be used as database keys, thus further simplifying the operations recited herein.
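- One way to use the compressed representation as a database key is sketched below with an in-memory SQLite table. The schema, table name, and helper names are hypothetical, and SHA-256 is an assumed stand-in for the compression function. The row for a digest can be counted before its plaintext is known and filled in later.

```python
import hashlib
import sqlite3


def compress(text: str) -> str:
    """Assumed compression function: hex SHA-256 of the source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    "digest TEXT PRIMARY KEY, plaintext TEXT, reports INTEGER NOT NULL)"
)


def record(digest: str, plaintext=None) -> None:
    """Count one report for a digest; store the plaintext once it is known."""
    conn.execute(
        "INSERT OR IGNORE INTO features (digest, plaintext, reports) "
        "VALUES (?, NULL, 0)",
        (digest,),
    )
    conn.execute(
        "UPDATE features SET reports = reports + 1, "
        "plaintext = COALESCE(?, plaintext) WHERE digest = ?",
        (plaintext, digest),
    )


# Two hashed reports arrive first, then one plaintext report for the same value.
d = compress("font-list-A")
record(d)
record(d)
record(d, "font-list-A")
row = conn.execute(
    "SELECT plaintext, reports FROM features WHERE digest = ?", (d,)
).fetchone()
```

- The report count per digest gives a rough measure of how widely a value has spread, without the plaintext having been sent by most devices.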
- FIG. 1 is a schematic diagram of a system for providing compressed reporting of computing device information using a blind hash.
- FIG. 2 is a schematic diagram of a system for performing deflection and detection of malicious activity with respect to a web server system.
- FIG. 3 is a flow chart of a process for reducing bandwidth requirements between computers.
- FIG. 4 is a swim lane diagram of a process for transferring data between client computers and a server system.
- FIG. 5A is a representation of a state machine for client-side encoding.
- FIG. 5B is a representation of a state machine for server-side decoding.
- FIG. 6 is a block diagram of a generic computer system for implementing the processes and systems described herein.
- the term “or” may be inclusive or exclusive unless expressly stated otherwise; the term “set” may comprise zero, one, or two or more elements; the terms “first”, “second”, “certain”, and “particular” are used as naming conventions to distinguish elements from each other and do not imply an ordering, timing, or any other characteristic of the referenced items unless otherwise specified; the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items; the terms “comprises” and/or “comprising” specify the presence of stated features, but do not preclude the presence or addition of one or more other features.
- clients and servers are terms used generally, and do not require any sort of formal client-server architecture.
- the mechanisms are most useful where many different computing devices will be communicating the same data to the server system. For example, it may be beneficial to have computing devices report their configuration information to a server system so that the system can identify commonality in the operations of such devices, for example, to diagnose reasons for faults in the devices or to identify the emergence of malware on the devices in a large group of devices (e.g., all devices that access a banking or retail web site).
- the common data that is communicated may be communicated by some of the computing devices in its native form (e.g., plaintext) or another form in which its content can be directly determined (e.g., via lossless compression or encryption for which the server system receiving the data can accurately decompress or decrypt the data).
- Others of the devices may communicate the same data in a form from which it cannot be identified directly, such as by submitting a hash of the data.
- when the server system receives compressed representations of the text but has not yet received the original representation, it can save indications of the compressed representations in association with the computing devices from which they were received, without knowing the original representation.
- when the server system receives any uncompressed representations, it can compress them using the same algorithm that the client devices used, can store the correlation of the compressed representation to the original representation, and can use that correlation to resolve any compressed representations, whether associated with events reported from computing devices in the past or the future, to determine what the compressed representation actually represents.
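- The store-then-resolve behavior described here could be kept in a small stateful object, as in the sketch below. The class and method names are hypothetical, and SHA-256 is an assumed stand-in for whatever algorithm the clients use.

```python
import hashlib
from collections import defaultdict


class BlindHashResolver:
    """Tracks which devices reported each digest and resolves digests lazily."""

    def __init__(self):
        self.known = {}                  # digest -> original text
        self.devices = defaultdict(set)  # digest -> ids of reporting devices

    @staticmethod
    def compress(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def receive_compressed(self, device_id: str, digest: str):
        """Record the device; return the original text if already known."""
        self.devices[digest].add(device_id)
        return self.known.get(digest)

    def receive_uncompressed(self, device_id: str, text: str):
        """Learn the correlation; return every device now resolved by it."""
        digest = self.compress(text)
        self.known[digest] = text
        self.devices[digest].add(device_id)
        return sorted(self.devices[digest])
```

- In this sketch, a digest reported before its plaintext arrives returns None, and the first matching plaintext report resolves all earlier and later reports of that digest.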
- Some or all compressed representations may be accompanied by a secondary representation that can be used to identify potential collisions between compressed representations.
- certain compressed representations will end up being repeated in a system—so that two identical compressed representations received by a server system could represent different original strings.
- the secondary representation may serve as a check on the main representation, as it will be extremely unlikely that both would match even though the original text did not.
- Such secondary representation may be transmitted to the server system with the compressed representation, and may be formed, for example, by applying a second hash or other compression technique to the original text that uses a different algorithm, or by sending a value that represents a length of the original string.
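- The collision check can be illustrated with a deliberately weak primary function, so that a collision is easy to produce. Everything in this sketch is an assumption for illustration: the byte-sum primary function, the truncated SHA-256 secondary function, and the tuple layout of a report.

```python
import hashlib


def weak_primary(text: str) -> int:
    """Deliberately weak primary compression: byte sum modulo 256."""
    return sum(text.encode("utf-8")) % 256


def secondary(text: str) -> str:
    """Secondary check using a different algorithm (truncated SHA-256)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


def make_report(text: str) -> tuple:
    """Report = (primary value, secondary value, source length)."""
    return (weak_primary(text), secondary(text), len(text))


def is_collision(report_a: tuple, report_b: tuple) -> bool:
    """Same primary value but different secondary data: sources differ."""
    return report_a[0] == report_b[0] and report_a[1:] != report_b[1:]
```

- Here "ab" and "ba" share a primary value and a length, so only the secondary hash exposes the collision; a length check alone would catch sources whose lengths differ.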
- the compressed representations or other representations that correspond to the compressed representations may then be passed as identifiers for the original data to systems that can perform analysis using such data.
- client devices may pass reports that indicate anomalous activity, such as efforts by a browser plug-in to access served code using defunct function names or the like (e.g., in a system that uses a security intermediary to change the function names with each serving of the web code).
- a fraud detection system may perform clustering analysis on the reported features of such computing devices, and may use the compressed representations as identifiers for the various reported features in performing such analysis.
- the analysis may be used to identify devices having particular characteristics (e.g., IP address, operating system, and browser) that have reported the existence of anomalous behavior, which may in turn be used to determine whether the anomalous behavior is benign (e.g., from a plug in that users intentionally installed) or malicious (e.g., code performing a “Man in the Middle” attack on their devices).
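- As a rough illustration of using compressed representations as feature identifiers, the sketch below groups anomaly reports by their set of feature digests. Exact-match grouping is a simplified stand-in for real clustering analysis, and the report layout, device names, and short digests are hypothetical.

```python
from collections import Counter


def group_anomalies(reports):
    """Count anomalous reports per combination of feature identifiers."""
    counts = Counter()
    for report in reports:
        if report["anomalous"]:
            # The digests act as opaque identifiers; no plaintext is needed.
            counts[frozenset(report["features"])] += 1
    return counts


# Hypothetical reports; "features" holds shortened digests for readability.
reports = [
    {"device": "dev-1", "features": ["a3f1", "77c0"], "anomalous": True},
    {"device": "dev-2", "features": ["77c0", "a3f1"], "anomalous": True},
    {"device": "dev-3", "features": ["a3f1", "19be"], "anomalous": False},
]
top = group_anomalies(reports).most_common(1)
```

- The most common combination can then be resolved to plaintext, if any device has reported it, to see which configuration the anomalous devices share.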
- FIG. 1 is a schematic diagram of a system 100 for providing compressed reporting of computing device information using a blind hash.
- the system 100 is directed to presenting information from a web server system 108 to a variety of computing devices 102 A-C that are located remotely from the web server system 108.
- Examples of operators of such a web server system 108 include on-line retailers and on-line banking systems, where the devices 102 A-C belong to people trying to buy products or perform on-line banking transactions.
- the web server system 108 is shown as a row of servers along with a separate row of servers for a security server system 106, both in a single data center facility. Such arrangement is intended to indicate that, in one typical implementation, an operator of a web site may supplement its main server system 108 with a security server system 106 that it builds itself or that it acquires from a third party.
- the security server system 106 may sit physically and logically between the web server system 108 and the network, which may include internet 104, and may intercept web code to be served to the various client devices 102 A-C.
- the system 100 operates by providing modified or recoded web code to the client computing device 102 , where the modifications are relative to a web page that would normally be served to the client computing device without additional security measures applied.
- Web code may include, for example, HTML, CSS, JavaScript, and other program code associated with the content or transmission of web resources such as a web page that may be presented at a client computing device 102 (e.g., via a web browser or a native application (non-browser)).
- the system 100 can detect and obstruct attempts by fraudsters and computer hackers to learn the structure of a website (e.g., the operational design of the pages for a site) and exploit security vulnerabilities in the client device 102 .
- malware may infect the client device 102 and gather sensitive information about a user of the device, or deceive a user into engaging in compromising activity such as divulging confidential information.
- Man-in-the-middle exploits are performed by one type of malware that is difficult to detect on a client device 102 , but can use security vulnerabilities at the client device 102 to engage in such malicious activity.
- Served code 110 shows an example of code that can be served to a requesting one of various of the computing devices 102 A-C after the request is provided to the web server system 108 , the content from the web server system 108 is intercepted or otherwise provided to the security server system 106 , and the code is changed and/or supplemented by the security server system 106 .
- Various portions of the served code 110 are shown schematically to indicate actions that the security server system 106 can take with respect to the code.
- Code 110 A represents the original web code provided by the web server system 108 with certain modifications made to it.
- the security server system 106 may change the names of functions in essentially random ways every time a set of content for a web page is served, where the changes are made consistently across the served code so as not to break internal references between pieces of the code.
- references to a particular function may be made consistently across HTML, CSS, and JavaScript.
- the following strings indicate HTML before and after alteration using a random number for textual replacement:
- Such changes may be made so that malware on a client device that receives the code cannot easily identify the operational structure of the web site and/or automatically interact with the code so as to mislead a user into opening its security to the malware (e.g., for a Man in the Middle attack).
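- Consistent renaming across several pieces of served code can be sketched as below. This is an illustration only: the identifier names, the sample HTML and JavaScript strings, and the random-token naming scheme are all hypothetical, and whole-word text substitution is a simplification of real recoding.

```python
import re
import secrets


def recode(names, *sources):
    """Replace each listed identifier with one fresh random name everywhere."""
    # One mapping per serving, so every reference changes consistently.
    mapping = {name: "x" + secrets.token_hex(6) for name in names}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    recoded = [pattern.sub(lambda m: mapping[m.group(1)], src) for src in sources]
    return mapping, recoded


# Hypothetical page fragments that refer to the same function name.
html = '<form id="login" onsubmit="submitLogin()">'
js = "function submitLogin() { return true; }"
mapping, (new_html, new_js) = recode(["submitLogin", "login"], html, js)
```

- Calling `recode` again yields different random names, which is what makes names from a prior serving stale.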
- the security server system 106 interferes with such attacks by malware.
- Instrumentation code 110 B is added to the code 110 A by the security server system 106 , and allows the system 100 to detect malware in addition to deflecting its efforts.
- the instrumentation code 110 B can execute in the background on the computing devices 102 A-C and can monitor how the code 110 A operates and how other code on the particular computing device 102 A-C interacts with the execution of code 110 A.
- the instrumentation code 110 B can monitor the DOM made from the code 110 A at different points in time and may report back to security server system 106 information that characterizes the current state of the DOM. Such information can be compared to information that indicates what the DOM should look like in order to determine whether other code is interfering with the execution of code 110 A.
- the instrumentation code can identify anomalous attempts by third-party code to interact with the operation of code 110 A, such as for calls made to code 110 A using “old” names for the code (e.g., names that were valid in a prior serving of the relevant web page but that are no longer relevant because security server system 106 is constantly changing the names so as to create a moving target for such third-party code to hit).
- a user telemetry script 110 C is also provided to a requesting one of computing devices 102 A-C.
- the user telemetry script 110 C may include code for managing communications between the relevant client device and the security server system 106 . Such communications may include transmission of information identified by the instrumentation code 110 B described above, and other relevant information.
- the security server system 106 can be supplied additional information using the user telemetry script and after the code 110 A has been served, such as information that affects the manner in which the instrumentation code 110 B operates.
- the security server system 106 may receive a report from the user telemetry script 110 C that indicates that a third-party program is attempting to interact with the served code 110 A, and may respond so as to have the instrumentation code 110 B perform certain operations to better understand the nature of the interaction occurring on the computing device.
- a request frequency code 110 D may also be sent and may be as simple as a single number that biases the user telemetry script 110 C to return information to the security server system 106 in its original form, or instead in a compressed form.
- the request frequency code 110 D that is sent in this example is a value of 1000, which may have been selected by the security server system 106 for a range between 0 and 1024 in this example.
- the user telemetry script 110 C may be programmed to select a random number between 0 and 1024, and to return the original text rather than a compressed version of the original text when the randomly-selected number exceeds 1000.
- original text will be returned by only about 2% of all computing devices that are served code from the security server system 106 using this request frequency value. Others of the computing devices will return a compressed version of the text, such as a hash of the original text produced by the particular device.
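The selection rule in this example can be sketched as below. It is a minimal illustration that assumes integer draws that are inclusive at both ends of the 0 to 1024 range; the function names are hypothetical.

```python
import random

RANGE_MAX = 1024          # upper bound of the draw, from the example
REQUEST_FREQUENCY = 1000  # request frequency value sent in the example


def sends_plaintext(draw, threshold=REQUEST_FREQUENCY):
    """Return True when the device should report the original text."""
    return draw > threshold


def plaintext_fraction(threshold=REQUEST_FREQUENCY, range_max=RANGE_MAX):
    """Share of integer draws in [0, range_max] that exceed the threshold."""
    return (range_max - threshold) / (range_max + 1)


draw = random.randint(0, RANGE_MAX)  # inclusive on both ends (assumption)
mode = "plaintext" if sends_plaintext(draw) else "hash"
print(draw, mode, round(plaintext_fraction(), 4))
```

Under these assumptions, 24 of the 1025 possible draws exceed 1000, which is consistent with the figure of about 2% given above.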
- the particular client devices 102 A-C may render respective webpages and establish document object models that represent the served page, in a familiar manner. User interactions with the webpage and associated code may then begin.
- the instrumentation code 110 B and user telemetry script 110 C may execute to return information about the configuration of a particular computing device to the security server system 106 .
- the user telemetry script 110 C may return data that identifies the operating system of the particular computing device, the model of the particular computing device, the amount of RAM loaded on the computing device, other applications executing on the computing device, and similar information.
- such functionality may be provided using a browser plug-in that is programmed to perform a check of the environment for the machine on which it is running.
- JavaScript or VBScript can permit the measurement of User Agent, other HTTP header information, indirect measurements of the JavaScript execution environment, plug-in information, fonts, and screen information.
- computing device 102 A returns the numeric pair 24.16.
- These numbers represent, respectively, a hash of a textual string that represents the name and model of the browser that is running on computing device 102 A, and the number of characters in that string.
- all three computing devices 102 A-C are running the “Chrome 2.3.21.04” browser release, as an example.
- Such information may be obtained by making a request that is to be responded to with the “user agent” string on the particular computing device, in a familiar manner.
- computing device 102 A delivered this compressed representation of the user agent string, because it generated a random number of 300, which is less than the request frequency number of 1000.
- when computing device 102 B received the served code 110 , it generated a random number of 674, meaning that it too would send a compressed version of the user agent string, or 24.16.
- 24 has been selected as an example to represent a hash that may be created from such a string, and the number 16 represents the number of characters in that string.
- computing device 102 C selected a random number of 1012, which is greater than the request frequency number of 1000.
- computing device 102 C will be one of the 2% of all devices that report back the original, uncompressed (unhashed) version of the user agent string.
- the user agent string for Firefox on an iPad is “Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405.”
- a compressed representation that indicates a hash and a length might be of the form 4528.111. As can be appreciated, the bandwidth for the latter is much lower than for the former.
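The size difference can be illustrated with a short sketch. The helper `compact_report` is hypothetical, and 4528 is simply the illustrative hash value used in the text rather than the output of any particular hash function.

```python
UA = ("Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) "
      "AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405")


def compact_report(hash_value, text):
    """Format a report as '<hash>.<length>', following the 4528.111 example."""
    return f"{hash_value}.{len(text)}"


report = compact_report(4528, UA)  # 4528 is the illustrative hash from the text
print(report, "uses", len(report), "characters instead of", len(UA))
```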
- operations of the security server system 106 performed in response to receiving the communications from devices 102 A-C are shown schematically as a two-column database entry below security server system 106 and Web server system 108 .
- the two columns are shown to indicate how a system may associate a compressed version of a string with the actual string itself.
- the database has been populated with the hash value of 24 upon receiving that hash value from device 102 A.
- the system 100 does not, at that point, know what the original string representation for that value is (assume that the system did not receive earlier communications regarding the user agent string from other devices), but stores the hash value 24 in anticipation that it will eventually be able to determine what the original, plaintext value is.
- the table has not changed because again, the security server system 106 received only the hash code, and not the original version of the user agent string.
- the system receives a string of original plaintext and, as shown by the arrow labeled with “hash,” the system performs a hashing function on that plaintext that is the same as a hashing function that the system 100 knew to be provided by the computing devices 102 A-C.
- each of the computing devices 102 A-C and the security server system 106 may be programmed to use the same hashing algorithm as the Java hash algorithm, which is well known and readily available on many computing platforms.
- the security server system 106 may search the table for a matching value, and when it finds such a matching value, it may determine that that matching hash value is what corresponds to the original text. It may then update the table to correlate the particular hash value with the particular original plaintext. Such a correlation is shown in the row of the table labeled with a 3 in a circle.
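The three table states described above may be sketched as follows. This is one possible reading, assuming that "the Java hash algorithm" refers to the behavior of Java's `String.hashCode()`; the `CorrelationTable` class and its method names are hypothetical.

```python
def java_string_hash(text):
    """Reproduce Java's String.hashCode() for Basic Multilingual Plane text."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF    # keep 32 bits, as Java int does
    return h - 0x100000000 if h >= 0x80000000 else h  # reinterpret as signed


class CorrelationTable:
    """Two-column table: hash value -> plaintext (None while unknown)."""

    def __init__(self):
        self.rows = {}

    def receive_hash(self, value):
        # Store the hash in anticipation of learning its plaintext later.
        self.rows.setdefault(value, None)

    def receive_plaintext(self, text):
        # Hash the text the same way the clients do, then correlate.
        value = java_string_hash(text)
        self.rows[value] = text
        return value


table = CorrelationTable()
ua = "Chrome 2.3.21.04"
table.receive_hash(java_string_hash(ua))  # device 102 A: hash only
table.receive_hash(java_string_hash(ua))  # device 102 B: table is unchanged
print(table.rows)                         # plaintext still unknown
table.receive_plaintext(ua)               # device 102 C: original text
print(table.rows)
```

The real hash of the example string is a large integer, not the value 24 that the text uses for illustration.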
- the number 24 can be used throughout the system to represent the user agent string represented here (i.e., as a unique database index value).
- a cluster analysis system like that discussed with respect to FIG. 2 below may use the number 24 to represent such a feature instead of using the full string representation.
- yet a third representation for the feature may be used as an index representation.
- the processing of the communication from computing device 102 C may also be accompanied by a determination that the full string is 16 characters in length.
- a value may be stored in yet a third column of the table (not shown) and may be correlated to the hash value and the original plaintext of the string.
- when later communications arrive with a hash value of 24, they may be compared to the first column shown in the table, and their accompanying value of 16 may be compared to this additional value to provide more confidence that the hash value is unique to this particular original textual string.
- other techniques may also be used to ensure that there are no collisions in the hash values, such as by returning an additional number or other representation that is generated by an alternative hash algorithm.
- the security server system 106 may provide a special message to the responding computing device to trigger the responding computer device to transmit the original plaintext code instead of the hash value.
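The length comparison and the fallback request for plaintext might be sketched as below. The function `check_report`, its return labels, and the table layout are assumptions made for illustration.

```python
def check_report(known, hash_value, length):
    """Compare an incoming '<hash>.<length>' pair with the stored entry.

    `known` maps hash value -> (stored length, plaintext).
    """
    entry = known.get(hash_value)
    if entry is None:
        return "unknown"            # no plaintext has been correlated yet
    stored_length, _plaintext = entry
    if stored_length != length:
        # Same hash, different length: a likely collision, so the server
        # may ask the device to send the original plaintext instead.
        return "request_plaintext"
    return "match"


known = {24: (16, "Chrome 2.3.21.04")}
print(check_report(known, 24, 16))
print(check_report(known, 24, 19))
print(check_report(known, 77, 16))
```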
- Other tables may store additional relationships that are of value in operating the system 100 .
- one table may store identifiers for particular ones of the computing devices 102 A-C, where a particular device may be identified by a cookie that it stores and passes to the security server system 106 . That device identifier may then be related to the variety of parameters, such as the user agent parameter just discussed, and additional parameters, which may include hardware identifiers, operating system identifiers, and software identifiers, among other things.
- the system 100 may correlate a particular device to particular configuration information and to configuration parameters reported by the device.
- a server system can also specify a seed to be used before generating a random number, or specify another random number generation method (and the initial state of the pseudorandom number generator (PRNG)) and the choice threshold value, such that the sequence of fields chosen will be known by the server. This can be used to force the client to generate an “uncompressed” value for a field that is unknown by the client. It can also be used to allow the server to have more control over the data flow (more or less data), and can even be used as a mechanism for determining when a malicious client is sending data in a non-compliant format, which could be used to determine that the client is, in fact, addled with malware.
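A seeded generator lets the server replay the choices a compliant client would make. The sketch below uses Python's `random.Random` purely as a stand-in for whatever PRNG the client and server actually share; the function names and the seed value are hypothetical.

```python
import random


def plaintext_schedule(seed, threshold, count, range_max=1024):
    """Replay the draws a compliant client would make for `count` fields."""
    rng = random.Random(seed)  # same seed gives the same sequence on both sides
    return [rng.randint(0, range_max) > threshold for _ in range(count)]


def find_violations(seed, threshold, observed_plaintext):
    """Return indices of fields whose reported form differs from the schedule."""
    expected = plaintext_schedule(seed, threshold, len(observed_plaintext))
    return [i for i, (e, o) in enumerate(zip(expected, observed_plaintext))
            if e != o]


expected = plaintext_schedule(seed=42, threshold=1000, count=8)
tampered = list(expected)
tampered[3] = not tampered[3]               # one field sent in the wrong form
print(find_violations(42, 1000, expected))  # compliant client
print(find_violations(42, 1000, tampered))  # non-compliant client
```

In this sketch, a threshold below the smallest possible draw (for example, -1) makes every draw exceed it, which is one way a server could force uncompressed values for every field.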
- hash values that have already been correlated with original text may also be tested by other incoming original text.
- the security server system 106 might not normally perform a hash on incoming original text if the system 106 has determined that there is already a correlation for that text in the table.
- a random number approach similar to that used on the computing devices 102 A-C may be used so that the security server system 106 periodically does perform such a hashing and comparison so as to confirm the accuracy of the data in the table. If the system 106 determines that there is an inaccuracy, because the hash value generated for an incoming string of text does not match a pre-existing hash value in the system for that text, the system 106 may generate an exception and alert an operator of the system 106 .
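The periodic self-check may be sketched as follows. CRC-32 from the standard library is only a placeholder for the hash in use, and `maybe_verify`, `TableMismatch`, and the `check_rate` parameter are hypothetical names.

```python
import random
import zlib


class TableMismatch(Exception):
    """Raised when a recomputed hash disagrees with the stored one."""


def stand_in_hash(text):
    # CRC-32 is only a placeholder for whatever hash the deployment uses.
    return zlib.crc32(text.encode("utf-8"))


def maybe_verify(table, text, check_rate, rng=random):
    """Occasionally recompute the hash for already-correlated text.

    `table` maps plaintext -> stored hash. Returns True if a check ran.
    """
    if text not in table or rng.random() >= check_rate:
        return False  # skip the hashing work for most incoming text
    if stand_in_hash(text) != table[text]:
        raise TableMismatch(f"stored hash for {text!r} no longer matches")
    return True


table = {"Chrome 2.3.21.04": stand_in_hash("Chrome 2.3.21.04")}
print(maybe_verify(table, "Chrome 2.3.21.04", check_rate=0.02))
```

A caller could catch `TableMismatch` and alert an operator, in line with the exception handling described above.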
- although the techniques discussed here have been associated with communications for the delivery of information related to browser environment and user or automated interactions with web pages, they may also, in appropriate circumstances, be applied more generally.
- other data that is reported at periodic intervals and is common as between a substantial portion of those reporting events may be compressed using the techniques here, and interpreted using uncompressed (or losslessly compressed) messages in some instances of the reporting, and lossy compressed messages corresponding to the same content in other instances of reporting.
- Various mechanisms including those discussed above and below, may be used to identify that the compressed and uncompressed messages match each other in their content, and to then associate the compressed messages with the uncompressed content.
- a stand-alone application for a particular organization may report information to a server system, and may be programmed to use the sometimes-compressed/sometimes-uncompressed techniques described here to transmit necessary data to the server system (particularly when the data is largely repetitive as between different reporting events for the data).
- FIG. 2 is a schematic diagram of a system 200 for performing deflection and detection of malicious activity with respect to a web server system.
- the system 200 may be the same as the system 100 discussed with respect to FIG. 1 , and is shown in this example to better explain the interrelationship of various general features of the overall system 200 , including the use of the reporting of compressed and uncompressed versions of the same strings in order to conserve bandwidth (for compressed representations) and to determine what the compressed representations represent (for uncompressed representations).
- the system 200 in this example is a system that is operated by or for a large number of different businesses that serve web pages and other content over the internet, such as banks and retailers that have on-line presences (e.g., on-line stores, or on-line account management tools).
- the main server systems operated by those organizations or their agents are designated as web servers 204 a - 204 n , and could include a broad array of web servers, content servers, database servers, financial servers, load balancers, and other necessary components (either as physical or virtual servers).
- a set of security server systems 202 a to 202 n are shown connected between the web servers 204 a to 204 n and a network 210 such as the internet. Although both extend to n in number, the actual number of sub-systems could vary. For example, certain of the customers could install two separate security server systems to serve all of their web server systems (which could be one or more), such as for redundancy purposes.
- the particular security server systems 202 a - 202 n may be matched to particular ones of the web server systems 204 a - 204 n , or they may be at separate sites, and all of the web servers for various different customers may be provided with services by a single common set of security servers 202 a - 202 n (e.g., when all of the server systems are at a single co-location facility so that bandwidth issues are minimized).
- Each of the security server systems 202 a - 202 n may be arranged and programmed to carry out operations like those discussed above and below and other operations.
- a policy engine 220 in each such security server system may evaluate HTTP requests from client computers (e.g., desktop, laptop, tablet, and smartphone computers) based on header and network information, and can set and store session information related to a relevant policy.
- the policy engine 220 may be programmed to classify requests and correlate them to particular actions to be taken to code returned by the web server systems (for transmission to requesting clients) before such code is served back to a client computer.
- the policy information may be provided to a decode, analysis, and re-encode module 224 , which matches the content to be delivered, across multiple content types (e.g., HTML, JavaScript, and CSS), to actions to be taken on the content (e.g., using XPATH within a DOM), such as substitutions, addition of content, and other actions that may be provided as extensions to the system.
- the different types of content may be analyzed to determine naming that may extend across such different pieces of content (e.g., the name of a function or parameter), and such names may be changed in a way that differs each time the content is served, e.g., by replacing a named item with randomly-generated characters.
- Elements within the different types of content may also first be grouped as having a common effect on the operation of the code (e.g., if one element makes a call to another), and then may be re-encoded together in a common manner so that their interoperation with each other will be consistent even after the re-encoding.
- a rules engine 222 may store analytical rules for performing such analysis and for re-encoding of the content.
- the rules engine 222 may be populated with rules developed through operator observation of particular content types, such as by operators of a system studying typical web pages that call JavaScript content and recognizing that a particular method is frequently used in a particular manner. Such observation may result in the rules engine 222 being programmed to identify the method and calls to the method so that they can all be grouped and re-encoded in a consistent and coordinated manner.
- the decode, analysis, and re-encode module 224 encodes content being passed to client computers from a web server according to relevant policies and rules.
- the module 224 also reverse encodes requests from the client computers to the relevant web server or servers.
- a web page may be served with a particular parameter, and may refer to JavaScript that references that same parameter.
- the decode, analysis, and re-encode module 224 may replace the name of that parameter, in each of the different types of content, with a randomly generated name, and each time the web page is served (or at least in varying sessions), the generated name may be different.
- when the name of the parameter is passed back to the web server, it may be re-encoded back to its original name so that this portion of the security process may occur seamlessly for the web server.
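Consistent renaming and the reverse mapping might look like the sketch below. It is a simplification that treats the content as plain strings and matches whole words with a regular expression; the parameter name `acct`, the `n_` alias prefix, and all function names are hypothetical.

```python
import re
import secrets


def make_mapping(names):
    """Assign each original name a fresh random alias for one serving."""
    return {name: "n_" + secrets.token_hex(4) for name in names}


def recode(content, mapping):
    """Replace whole-word occurrences of each key with its mapped value."""
    if not mapping:
        return content
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], content)


def decode(content, mapping):
    """Map aliases in a client request back to the original names."""
    reverse = {alias: name for name, alias in mapping.items()}
    return recode(content, reverse)


html = '<input name="acct" id="acct">'
js = 'document.getElementById("acct").value'
css = '#acct { color: red; }'

mapping = make_mapping(["acct"])   # a new mapping for each serving
served = [recode(part, mapping) for part in (html, js, css)]
print(served)
print(decode(mapping["acct"] + "=12345", mapping))
```

Because the same mapping is applied to all three content types, references between them stay intact, and the web server only ever sees the original parameter name.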
- a key for the function that encodes and decodes such strings can be maintained by the security server system 202 along with an identifier for the particular client computer so that the system 202 may know which key or function to apply, and may otherwise maintain a state for the client computer and its session.
- a stateless approach may also be employed, whereby the system 202 encrypts the state and stores it in a cookie that is saved at the relevant client computer. The client computer may then pass that cookie data back when it passes the information that needs to be decoded back to its original status. With the cookie data, the system 202 may use a private key to decrypt the state information and use that state information in real-time to decode the information from the client computer.
- Such a stateless implementation may create benefits such as less management overhead for the server system 202 (e.g., for tracking state, for storing state, and for performing clean-up of stored state information as sessions time out or otherwise end) and as a result, higher overall throughput.
- An instrumentation module 226 is programmed to add instrumentation code to the content that is served from a web server.
- the instrumentation code is code that is programmed to monitor the operation of other code that is served. For example, the instrumentation code may be programmed to identify when certain methods are called, when those methods have been identified as likely to be called by malicious software. When such actions are observed to occur by the instrumentation code, the instrumentation code may be programmed to send a communication to the security server reporting on the type of action that occurred and other meta data that is helpful in characterizing the activity. Such information can be used to help determine whether the action was malicious or benign.
- the instrumentation code may also analyze the DOM on a client computer in predetermined manners that are likely to identify the presence of and operation of malicious software, and to report to the security servers 202 or a related system.
- the instrumentation code may be programmed to characterize a portion of the DOM when a user takes a particular action, such as clicking on a particular on-page button, so as to identify a change in the DOM before and after the click (where the click is expected to cause a particular change to the DOM if there is benign code operating with respect to the click, as opposed to malicious code operating with respect to the click).
- Data that characterizes the DOM may also be hashed, either at the client computer or the server system 202 , to produce a representation of the DOM (e.g., in the differences between part of the DOM before and after a defined action occurs) that is easy to compare against corresponding representations of DOMs from other client computers.
- other techniques may also be used by the instrumentation code to generate a compact representation of the DOM or other structure expected to be affected by malicious code in an identifiable manner.
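One way to produce a comparable representation of a DOM change is sketched below. It assumes the relevant part of the DOM has been captured as a nested dictionary and uses SHA-256 over a canonical JSON form; neither choice comes from the original text, and the function names are hypothetical.

```python
import hashlib
import json


def dom_digest(node):
    """Hash a DOM-like nested dict via a canonical JSON serialization."""
    canonical = json.dumps(node, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def change_signature(before, after):
    """Summarize a before/after pair as a single comparable digest."""
    combined = dom_digest(before) + dom_digest(after)
    return hashlib.sha256(combined.encode("ascii")).hexdigest()


before = {"tag": "form", "children": [{"tag": "input", "name": "amount"}]}
after_expected = {"tag": "form", "children": [
    {"tag": "input", "name": "amount"}, {"tag": "p", "text": "Sent"}]}
after_injected = {"tag": "form", "children": [
    {"tag": "input", "name": "amount"}, {"tag": "input", "name": "pin"}]}

print(change_signature(before, after_expected))
print(change_signature(before, after_injected))
```

Clients that undergo the same change produce the same signature, so a differing signature stands out when compared across client computers.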
- the instrumentation module 226 or another component may also provide a user telemetry script or other code for causing the client device receiving the other code to communicate with the server system after the code is transmitted.
- additional code may include code that causes the client devices to return configuration information about themselves, and to control whether they return the information in a compressed or native state, in the manners described above.
- the module 226 may also generate and provide to the client devices a request frequency value that helps control how often the native text is transmitted back to the system instead of the compressed form of the text.
- One or more modules may also control the receipt of such configuration information, the storage of the information, and the correlation of the compressed data (e.g., being used as an index value for a table) and the corresponding original form of the data.
- Uninfected client computers 212 a - 212 n represent computers that do not have malicious code programmed to interfere with a particular site a user visits or to otherwise perform malicious activity.
- Infected client computers 214 a - 214 n represent computers that do have malware, or malicious code ( 218 a - 218 n , respectively), programmed to interfere with a particular site a user visits or to otherwise perform malicious activity.
- the client computers 212 , 214 may also store the encrypted cookies discussed above and pass such cookies back through the network 210 .
- the client computers 212 , 214 will, once they obtain the served content, implement DOMs for managing the displayed web pages, and instrumentation code may monitor the respective DOMs as discussed above. Reports of illogical activity (e.g., software on the client device calling a method that does not exist in the downloaded and rendered content) can then be reported back to the server system.
- each web site operator may be provided with a single security console 207 that provides analytical tools for a single site or group of sites.
- the console 207 may include software for showing groups of abnormal activities, or reports that indicate the type of code served by the web site that generates the most abnormal activity.
- a security officer for a bank may determine that defensive actions are needed if most of the reported abnormal activity for its web site relates to content elements corresponding to money transfer operations—an indication that stale malicious code may be trying to access such elements surreptitiously.
- a central security console 208 may connect to a large number of web content providers, and may be run, for example, by an organization that provides the software for operating the security server systems 202 a - 202 n .
- Such console 208 may access complex analytical and data analysis tools, such as tools that identify clustering of abnormal activities across thousands of client computers and sessions, so that an operator of the console 208 can focus on those clusters in order to diagnose them as malicious or benign, and then take steps to thwart any malicious activity.
- the console 208 may have access to software for analyzing telemetry data received from a very large number of client computers that execute instrumentation code provided by the system 200 .
- Such data may result from forms being re-written across a large number of web pages and web sites to include content that collects system information such as browser version, installed plug-ins, screen resolution, window size and position, operating system, network information, and the like.
- user interaction with served content may be characterized by such code, such as the speed with which a user interacts with a page, the path of a pointer over the page, and the like.
- the telemetry data may also include the received data that characterizes the then-current conditions of each of the client devices, such as the browser and operating systems that they were running, and other appropriate information.
- Such collected telemetry data may be used by the console 208 to identify what is “natural” interaction with a particular page that is likely the result of legitimate human actions, and what is “unnatural” interaction that is likely the result of a bot interacting with the content.
- Statistical and machine learning methods may be used to identify patterns in such telemetry data, and to resolve bot candidates to particular client computers. Such client computers may then be handled in special manners by the system 200 , may be blocked from interaction, or may have their operators notified that their computer is potentially running malicious software (e.g., by sending an e-mail to an account holder of a computer so that the malicious software cannot intercept it easily).
- FIG. 3 is a flow chart of a process for reducing bandwidth requirements between computers.
- the process involves providing client computers with code that causes the computers to report back aspects of their operation. Some of the client computers are caused to report the information in compressed form, while others are caused to report the same information in an original uncompressed, or plaintext, form.
- the process can then use the combination of compressed and uncompressed reported information to correlate the compressed representations with the uncompressed representations, even though no particular computer or transmission provided such a correlation for the server system that served the code.
- the server system may make the correlation, for example, by performing a compression of received uncompressed code in a manner that matches the way that one or more of the client devices performed the compression of the same code or data.
- the process begins at box 302 , where the server system serves Web code to a plurality of different client devices.
- the Web code may be code for a particular webpage, for multiple related webpages, or for various unrelated webpages associated with different websites, including websites from different domains.
- the Web code may be recoded from what is initially served by a Web server, such as by rewriting the names of particular functions or other elements in unpredictable manners but in a way that is consistent across all of the elements being served (e.g., so that the code does not break when executed and so that calls made to a particular function or other element are changed according to the changes made in the name of the element).
- supplemental code is served by the system.
- the supplemental code may be served along with the Web code in a single transaction, or may be served separately.
- the supplemental code may include, for example, instrumentation code and telemetry code that causes the receiving client device to monitor the operation of the Web code that is served to the device and potentially to report back on such operation to a security server system, if the monitoring determines that anomalous activity is occurring on the client device.
- Other code may also be served, such as parameter values that may affect the way in which the supplemental code operates, such as a request frequency number described above, and other appropriate values.
- the server system may have waited after serving both the Web code and the supplemental code, and may subsequently receive, from the client or clients to whom the code was served, hashed representations for configuration.
- Those representations may represent a variety of parameters that are relevant to the client devices from which they come, including identifiers for the current configuration state of a particular client device.
- the particular parameter may be identified, and the value of the identified parameter may be represented by the hash code that one of the client devices generated by hashing the plaintext parameter value.
- a number of different parameters may be reported on for each client device, and even more parameters may be reported on across a universe of client devices. For example, Web code served from a certain webpage may be accompanied by instrumentation code that reports back on certain parameters of a device, while Web code served for another webpage may be accompanied by code that reports back on other parameters.
- Such representations may also be associated with identifiers for client devices from which they were received, so that the particular configuration information for those devices may be determined later, even if it cannot be determined when the hashed representations are initially received.
- plaintext representations are received from one or more client devices.
- the plaintext representations may have been transmitted by those client devices in response to the client devices executing instrumentation or telemetry code that instructed the transmission of such plaintext versions of the information (e.g., upon the client device choosing to transmit plaintext rather than a compressed representation).
- the security system may be programmed to first compress those plaintext representations such as by hashing them. The compression may occur according to a mechanism that matches a known hashing mechanism to be operating on the client devices in cooperation with the instrumentation and telemetry code that was served to those client devices.
- the security system will now have a correlation between a particular plaintext representation and a particular hash value.
- the system may then compare that hash value to any of the hashed values that have previously been received, at box 314 , and may then correlate whatever previously-received hash values were received to the plaintext representation that was later received, at box 316 .
- the initial transfer of a particular piece of data may be in plaintext form, so that the database would be populated with a plaintext representation and a hash representation simultaneously. Later transmissions of plaintext representations may simply be matched against the plaintext column of the database, and the devices that sent those plaintext representations may be correlated with the hashed value as an index value for those devices.
- the plaintext values that are later received may always be hashed, and the hashed values may be compared against the database if that is a more efficient operation of the system computationally. Also, periodically, plaintext representations and their hash values may be checked against the table to ensure that there are no errors in the data. In addition, other values that represent the plaintext may be transmitted along with the hashed representations of the plaintext so as to ensure that the system is not receiving overlapping hash values that match each other but that each represent different plaintext representations.
- characteristics of infected computers are identified using information gleaned from the previous steps.
- the hashed values may be used as data in statistical analysis techniques, such as techniques that may attempt to identify clusters of activity within a population of computers, such as a population of hundreds of thousands of computers. Clustering may indicate anomalous activities by those computers, and the hash values may then be used to determine what configuration information is possessed in common by computers within that cluster.
- the analysis may determine that a large majority of computers having anomalous behavior are running a recently released operating system or browser version (i.e., that anomalous behavior is clustered around a dimension associated with that particular value of the user agent parameter for a population of machines). Such a determination may be evidence of a vulnerability of such browser or operating system version to malware. An operator of the system described here may then act upon such information, such as to cause the browser or operating system to be updated or the security hole to otherwise be plugged.
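As a rough illustration of the last step, the following sketch checks whether one hashed parameter value dominates among devices already flagged as anomalous. It does not perform clustering itself; the function name `dominant_value`, the 0.8 share cutoff, and the sample data are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def dominant_value(
    hashed_param_by_device: Dict[str, str],
    anomalous_devices: Iterable[str],
    min_share: float = 0.8,  # assumed cutoff, not taken from the text
) -> Optional[Tuple[str, float]]:
    """Return (hash value, share) if one value covers min_share of the flagged devices."""
    counts = Counter(
        hashed_param_by_device[d]
        for d in anomalous_devices
        if d in hashed_param_by_device
    )
    if not counts:
        return None
    value, hits = counts.most_common(1)[0]
    share = hits / sum(counts.values())
    return (value, share) if share >= min_share else None


# Made-up data: "h1" and "h2" stand for hashed user agent values.
reports = {"d1": "h1", "d2": "h1", "d3": "h1", "d4": "h2", "d5": "h2"}
print(dominant_value(reports, ["d1", "d2", "d3"]))  # ('h1', 1.0)
print(dominant_value(reports, ["d1", "d4"]))        # None
```

The returned hash value can then be looked up in the hash-to-plaintext correlation to learn which browser or operating system version it represents.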
- FIG. 4 is a swim lane diagram of a process for transferring data between client computers and a server system.
- the process, like those discussed above, involves transmitting content to a server system, in most instances, in a compressed manner from which the identity of the original content cannot be determined (a lossy compression like forming a hash).
- the content can be transmitted in an uncompressed or losslessly compressed form.
- the received data may be compressed using a process equivalent to the process that was used by clients on the other received content, and the compressed form may be matched to the compressed forms received in that other received content. In this way, the original form of the other received content (both past and future) can be inferred.
- a client device requests a web page, such as via a GET or POST method.
- a request may be directed to a particular URL served by a web server system of a particular organization.
- the request may result in the web server system identifying appropriate code to respond to the request, which may include static code and dynamic code, and may take the form of HTML, CSS, and JavaScript, among others.
- the web server system serves the responsive code.
- the served code is intercepted at box 406 by a security server system that, e.g., the operator of the web server system has added as an intermediary for providing security for the web server system.
- a third party may provide a security system that can be added modularly to a company's web server system without having to affect the web server system in any substantial manner.
- the intermediary functionality may be integrated in the web server system.
- the intermediary server system may be physically located within the same building as the web server system (for minimizing latency and maximizing the ability to coordinate systems) or in a separate location that requires communication through a network, including the Internet.
- the security server system intercepts the code and modifies it.
- the names of certain functions may be changed in a sufficiently random or arbitrary manner that the new names cannot be anticipated by malware running on the clients.
- the changes may be coordinated across different types of code (e.g., HTML, CSS, and JavaScript) where the names occur, so that the code functions the same as the code it replaced.
- the changes are made to latent code whose operation a user does not see, and static code.
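The coordinated renaming described above can be sketched as below. This is only an assumed approach: the function `rename_identifiers`, the `f_` prefix, and the whole-word regular expression are illustrative choices, and a real system would need language-aware parsing rather than text substitution.

```python
import re
import secrets
from typing import Dict, List, Tuple


def rename_identifiers(
    sources: Dict[str, str], names: List[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Replace each listed name with the same random name in every source."""
    if not names:
        return dict(sources), {}
    # One unpredictable replacement per original name, shared by all code types.
    mapping = {name: "f_" + secrets.token_hex(8) for name in names}
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    rewritten = {
        kind: pattern.sub(lambda m: mapping[m.group(1)], text)
        for kind, text in sources.items()
    }
    return rewritten, mapping


# Made-up page fragments that share two names across HTML and JavaScript.
page = {
    "html": '<form id="loginForm" onsubmit="checkLogin()">',
    "js": "function checkLogin() { return document.getElementById('loginForm'); }",
}
rewritten, mapping = rename_identifiers(page, ["loginForm", "checkLogin"])
print(rewritten["html"])
```

Because one mapping is applied to every code type, a reference in the HTML still points at the renamed JavaScript function, while the new names differ on every call.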
- the code is appended with monitoring and reporting code.
- Such code may monitor the DOM that is created on the client when the served code is rendered, or may monitor attempts to interact with the code, and may characterize and report any abnormal activity.
- Such code may also report other status information about a client, such as configuration information that describes the features of the client system. In certain situations, a complete picture of what is occurring in the browser or other application (e.g., a specific app programmed for the company that serves the code).
- the reporting code may in particular include code for making a determination whether to report particular information in a compressed versus an uncompressed form, and then to transmit the data back to the server system accordingly.
- the client renders the web page by executing the various types of served code, and perhaps by acquiring code from other sources in addition to the code that was initially served by the web server system (whether from the organization that operates the web server system or from one or more other organizations).
- the serving and executing of code described here would be repeated across thousands or more different client devices that may each vary in different ways, such as by having different base (the basic computer) and extended hardware (e.g., added graphics cards or RAM), operating systems, installed and executing applications, and executing browser plug ins.
- each rendering of the web page may be performed in a different manner for different ones of the client devices, and even for the same client device in different sessions.
- the client device generates characterization and activity data that is to be sent back to the server systems.
- the box is labeled with a “1” to indicate that this step represents a subset of the devices that are served the web code, and are the devices that hash the data that is to be reported so as to lower the bandwidth required for such reporting.
- the vast majority of instances would be established to report in such a manner so as to significantly reduce the overhead of transmitting the data.
- characterization data represents status of the client device, such as hardware and software on the device, whereas activity data represents actions that have occurred on the device, particularly since the device received the served web code (e.g., activities between the served code and other code that is on the device).
- the characterization data may be sent to one server system, while the activity data may be sent to another, or they may be sent to the same server system.
- certain data may be sent according to the compressed/uncompressed scheme described in this document when the data is expected to be common across many devices, so that the original value of the content for devices that compress their content can be inferred from the uncompressed content (where, unless otherwise noted, uncompressed content includes content whose original form can be determined by a server system that receives it, and thus includes losslessly compressed content).
- Other data may be sent in a normal manner, without the pairing of compressed/uncompressed transmission, such as where the content is not typically common as among different machines, so that there would be relatively little value in trying to infer the original content from transmissions made by other machines.
- an analysis system receives the reported data, which may include activity data.
- the analysis system may use such activity data to identify that certain normal or anomalous activities have occurred on a certain device, and may conduct analysis on similar activity data received from a large number of other devices to identify clusters of common activity so as to determine that malware is taking advantage of such devices.
- the analysis system may also be provided with characterization data so that it can determine characteristics of the devices that are being affected by the malware.
- the client device may provide similar data to the security server system, as indicated at box 416 .
- the security server system may then associate the particular client device with the hashed forms of the compressed content that is sent (as the analysis system may do if it receives only hashed data).
- the security server system has received no unhashed form of the content, so it does not know what the original form of the content was.
- the system may simply associate an identifier for the particular device with the hashed form of the received content (or may simply index upward a count of the number of clients reporting the content of the particular form of hash).
- multiple different fields may be reported in a hashed manner, such as one or more fields that identify hardware for a device, and one or more fields that identify software executing on the device.
- Each feature of the device e.g., make and model, operating system, amount of RAM, etc.
- another client (indicated by the circled “2”) also generates and reports characterization and activity data.
- the particular device does not compress the content that it reports—e.g., because it selected a number pseudo-randomly that does not exceed a predetermined level that was provided with the web page code.
- the analysis system may receive at least some of the generated content at box 420 (which may be the same content as received at box 414 or may contain some fields whose parameters are the same as those received at box 414 ), though here the content would be received in uncompressed form (e.g., either as plaintext or in a losslessly compressed format).
- the analysis system may compress the received uncompressed content to form a hash value and may then compare it to compressed content that was previously received. If the hash value matches a hash value stored from box 414 , then the original content may be associated by the system with the other devices that previously reported the hashed value, as may future devices that report the hashed value. Alternatively, or in addition, the analysis system can add to a number of devices that have reported as having the particular parameter.
- the second client device can report the characterization and activity data to the security server system, and at box 422 , that system can generate a hash value for it.
- certain other fields may have been reported in hashed form or may always be reported by all devices in uncompressed form.
- the security server system associates the particular parameters received from box 412 with the other instances of reporting the same content (as determined by comparing the just-generated hash value with previously-received hash values from the other devices).
- the analysis system can request ID and parameter data (box 426 ) from the security server system.
- the security server system (box 428 ) may gather and transmit such data, and the analysis system may identify common features of anomalously-acting machines (box 430 ) using such data.
- the analysis system may receive activity data and use such data to identify clusters of common activity, or otherwise identify potential problems that arise in the operation of a number of different client devices.
- the analysis system may not know the characterization data for the devices, and may only seek such data from the security server system after identifying the problem.
- Such follow-up information gathering may then be used by the analysis system to identify features of the devices that are determined to be acting anomalously, such as by determining that they all are executing the same browser program, and perhaps a common version or range of versions of that program.
- the analysis system may repeat operations that are performed by the security server system, such as in the inferring of the original content of compressed messages via compressing of received uncompressed messages.
- the security server system and the analysis server system may be part of the same system or separate systems.
- a retailer may manage both systems along with a web server system.
- a third-party may operate the analysis server system from its own facility, and can assist customers with operating their particular security server systems on their premises, with their web server systems.
- the third-party may aggregate activity over a large number of served content in such manner, and may more readily identify anomalous behavior than could a single organization serving only a fraction of such content.
- FIGS. 5A and 5B show, respectively, state diagrams for a client and a server operating according to the mechanisms described above.
- the client device begins its operations by which it prepares information for transmission to the server system and performs the transmission.
- a determination is made whether more fields need to be encoded for transmission. If not, then the client waits until a next time that processing and transmission is needed.
- a random number (which may be pseudorandom or otherwise less than exactly random) is generated at box 506 .
- the number may be generated for each overall transmission or for each field within a transmission (so that some field values may be compressed and some not).
- the client determines whether the generated number exceeds a threshold. That threshold may be a predetermined value that is relatively permanent and stored by the client for a long time, or may be highly variable, where the threshold is transmitted with code recently received by the client, or is accessed at run time by the client (e.g., by submitting a GET function to a remote server system).
- the client sends a raw version of the relevant field, such as in plaintext or losslessly compressed form of the content for the field (box 512 ). If the threshold is not exceeded, then a hashed version of the content is sent (box 510 ). Of course, the determination may be made inversely, so that the hashed form is sent if the threshold is exceeded (and/or matched), and the raw data is sent if it is not (and/or is matched).
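The client-side decision of FIG. 5A can be sketched as follows. The function `encode_field`, the dictionary layout of a report, and the use of SHA-256 are assumptions for illustration; the text leaves the hash algorithm and message format open.

```python
import hashlib
import random
from typing import Callable, Dict


def encode_field(
    name: str,
    value: str,
    threshold: float,
    draw: Callable[[], float] = random.random,  # injectable for testing
) -> Dict[str, str]:
    """Send the raw value if the drawn number exceeds the threshold, else a hash."""
    if draw() > threshold:
        # Threshold exceeded: report the raw (plaintext) form of the field.
        return {"field": name, "form": "raw", "value": value}
    # Threshold not exceeded: report only the hash of the field.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return {"field": name, "form": "hash", "value": digest}


# With a threshold of 0.99, roughly 1 report in 100 would carry raw data.
print(encode_field("user_agent", "ExampleBrowser/1.0", threshold=0.99))
```

Calling `encode_field` once per field, rather than once per transmission, gives the per-field variant in which some field values are hashed and others are not.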
- the process begins, such as by the server determining that it has received data for a plurality of fields, where the data needs to be interpreted by the server system. If there are no more fields to process, the system returns to a rest state, but if there are, then the server analyzes the next field in line and determines whether it is in raw form or hash form (box 524 ). If it is in raw form, then the server hashes the field using a hashing technique that matches a technique that the server knows to be performed by various clients that are reporting data to it (box 532 ).
- the server then associates the hash result with the raw content (box 534 ).
- the system can then use such a correlation between the hash result and the raw data to interpret other communications from other clients that contain only the hash result.
- the system can use the correlation to infer what the raw form at the client was when only the hash form is received.
- the system performs a lookup on the hash form (box 526 ). The system determines whether the hash form of the field is found in the system (box 528 ), so as to indicate that a correlation has already been stored between the hash form and the raw form. If the hash form is found, then the system can get the raw value (box 530 ) and act accordingly. If the field is not found (e.g., because the value for the field has not previously been received in raw form), then the occurrence of the receipt of the hash form from the client may be saved and noted, and the system may return to check if additional fields need processing.
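The server-side loop of FIG. 5B can be sketched as below, with comments mapping each branch to the boxes named above. The function `process_fields`, the `form`/`value` keys, and SHA-256 are assumptions; the only requirement taken from the text is that the server uses the same hashing technique as the clients.

```python
import hashlib
from typing import Dict, List, Optional


def process_fields(
    fields: List[Dict[str, str]],
    raw_by_hash: Dict[str, str],
    unresolved_counts: Dict[str, int],
) -> List[Optional[str]]:
    """Return the raw value for each field, or None where it is not yet known."""
    resolved: List[Optional[str]] = []
    for field in fields:
        if field["form"] == "raw":  # box 524: raw or hash?
            # Box 532: hash with the technique the clients are known to use.
            digest = hashlib.sha256(field["value"].encode("utf-8")).hexdigest()
            raw_by_hash[digest] = field["value"]  # box 534: store the correlation
            resolved.append(field["value"])
        else:
            digest = field["value"]
            raw = raw_by_hash.get(digest)  # boxes 526 and 528: lookup
            if raw is None:
                # Not found: note the occurrence so it can be interpreted later.
                unresolved_counts[digest] = unresolved_counts.get(digest, 0) + 1
            resolved.append(raw)  # box 530 when found
    return resolved


known: Dict[str, str] = {}
pending: Dict[str, int] = {}
h = hashlib.sha256("Plugin X 2.1".encode("utf-8")).hexdigest()  # made-up value
print(process_fields([{"form": "hash", "value": h}], known, pending))  # [None]
print(process_fields([{"form": "raw", "value": "Plugin X 2.1"},
                      {"form": "hash", "value": h}], known, pending))
```

The second call resolves the hashed field because the raw form in the same batch has already populated the correlation; the count kept in `pending` records the earlier report that arrived before the raw form was known.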
- FIG. 6 is a schematic diagram of a computer system 600 .
- the system 600 can be used for the operations described in association with any of the computer-implemented methods described previously, according to one implementation.
- the system 600 is intended to include various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- the system 600 can also include mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices.
- the system can include portable storage media, such as Universal Serial Bus (USB) flash drives.
- USB flash drives may store operating systems and other applications.
- the USB flash drives can include input/output components, such as a wireless transmitter or USB connector that may be inserted into a USB port of another computing device.
- the system 600 includes a processor 610 , a memory 620 , a storage device 630 , and an input/output device 640 .
- Each of the components 610 , 620 , 630 , and 640 is interconnected using a system bus 650 .
- the processor 610 is capable of processing instructions for execution within the system 600 .
- the processor may be designed using any of a number of architectures.
- the processor 610 may be a CISC (Complex Instruction Set Computer) processor, a RISC (Reduced Instruction Set Computer) processor, or a MISC (Minimal Instruction Set Computer) processor.
- the processor 610 is a single-threaded processor. In another implementation, the processor 610 is a multi-threaded processor.
- the processor 610 is capable of processing instructions stored in the memory 620 or on the storage device 630 to display graphical information for a user interface on the input/output device 640 .
- the memory 620 stores information within the system 600 .
- the memory 620 is a computer-readable medium.
- the memory 620 is a volatile memory unit.
- the memory 620 is a non-volatile memory unit.
- the storage device 630 is capable of providing mass storage for the system 600 .
- the storage device 630 is a computer-readable medium.
- the storage device 630 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
- the input/output device 640 provides input/output operations for the system 600 .
- the input/output device 640 includes a keyboard and/or pointing device.
- the input/output device 640 includes a display unit for displaying graphical user interfaces.
- the features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them.
- the apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output.
- the described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
- a computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result.
- a computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer.
- a processor will receive instructions and data from a read-only memory or a random access memory or both.
- the essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data.
- a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks.
- Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer. Additionally, such activities can be implemented via touchscreen flat-panel displays and other appropriate mechanisms.
- the features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them.
- the components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), peer-to-peer networks (having ad-hoc or static members), grid computing infrastructures, and the Internet.
- the computer system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a network, such as the described one.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Abstract
Description
- This application claims the benefit under 35 U.S.C. § 120 as a Continuation of U.S. patent application No. 14/980,231, filed on 2015-12-28, which is a Continuation of U.S. patent application No. 14/160,107, filed on 2014-01-21, the entire contents of which are hereby incorporated by reference as if fully set forth herein.
- This document generally relates to computer communications.
- The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
- Web content, such as HTML or JavaScript for generating web pages, may contain application-like functionality that is interpreted and executed within a visitor's browser, or in a similar application. The general goal with HTML and other web technologies is to make them work, and work similarly, across many different platforms (e.g., Mac, PC, Linux, etc.).
- To maximize the functionality of web content, it can be relevant for a system that serves the content to know the configurations of computers (whether desktop, smartphone, tablet, or other) that are being served the content. For example, particular knowledge can be obtained by identifying the type of browser that is rendering a web page, the operating system on which the browser is running, and plug ins that might also be operating on such computers. However, this additional supporting information must generally be sent from the various client computers to the server system, and such transmission adds overhead to the functioning of a browser presenting a web page or other application, which overhead is not directly responsible for improving operation of the page.
- This document describes systems and techniques by which various user computing devices (computers such as desktops, laptops, tablets, and smartphones) can submit information to a server system in a manner that lowers the bandwidth required for such reporting. Specifically, certain of the computing devices can send information in a lossy compressed format (e.g., as a hash of the original information), while others can send the same information in an uncompressed format (e.g., as the original plaintext).
- The compressed format may be highly compressed, such as by a lossy one-way function so that the server system cannot immediately determine what original string a compressed submission is indicative of (e.g., via a hash function or other lossy compression function).
- To determine what the compressed submissions represent, the server system compresses any received uncompressed submissions (or submissions made with lossless compression) using the same technique used by the client devices to perform their compression, at which point the server system knows the correlation between the uncompressed and compressed representations, and can then correlate any previously- or later-received compressed representations back to the original raw data. The percentage of the client computers reporting raw data may be much smaller than those reporting compressed data, so that the overall bandwidth of the system is substantially reduced. For example, each of the computing devices may determine whether it should submit a compressed representation of the data, or instead, an uncompressed representation by generating a random number (again, e.g., using standard JavaScript functions), and only send a particular format or representation if the generated number is above or below a predetermined number, as the case may be.
- The server system may provide a biasing value to the computing devices when it serves web code so as to push the random number higher or lower, so as to affect the likelihood that any particular computing device will send uncompressed, raw data instead of compressed data. More frequent submission of uncompressed representations will allow a server system to more quickly identify the real meaning of data that newly arrives, e.g., when new features arrive on the computing devices (e.g., new plug ins are announced), but could cause higher bandwidth usage in a pool of computing devices. Thus, an operator of a server system may use the biasing value to match its desire for fast reaction versus its desire for lower bandwidth requirements.
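The trade-off controlled by the biasing value can be made concrete with a small calculation. The sketch below assumes each device independently sends raw data with probability `p_raw`; the function names and the byte sizes in the example are illustrative, not figures from this document.

```python
def expected_bytes_per_report(p_raw: float, raw_bytes: int, hash_bytes: int) -> float:
    """Average payload size when a fraction p_raw of reports is sent raw."""
    return p_raw * raw_bytes + (1.0 - p_raw) * hash_bytes


def expected_reports_until_first_raw(p_raw: float) -> float:
    """Mean number of reports of a new value before one arrives raw.

    With independent draws this is the mean of a geometric distribution, 1 / p_raw.
    """
    return 1.0 / p_raw


# Assumed sizes: a 2000-byte plaintext list versus a 32-byte hash.
for p in (0.01, 0.10):
    size = expected_bytes_per_report(p, raw_bytes=2000, hash_bytes=32)
    wait = expected_reports_until_first_raw(p)
    print(f"p_raw={p}: about {size:.1f} bytes per report, {wait:.0f} reports until first raw")
```

Under these assumed sizes, raising `p_raw` from 0.01 to 0.10 shortens the expected wait for the meaning of a new value by a factor of ten, at the cost of a larger average payload.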
- To further minimize the amount of data transfer needed, the compression algorithm may be one that is available from public libraries, such as standard JavaScript hash algorithms. In this manner, the server system may automatically obtain plaintext representations of new data as it arrives in a pool of computers (e.g., all computers trying to access a particular retailer's web site), but may also determine how broadly such information has spread without having to send the potentially voluminous plaintext representation for very many of the computing devices.
- Generally, hashing algorithms are selective enough that very few collisions will be seen between hashes (i.e., two different strings of text sent by computing devices will seldom generate the same hash value). When there are collisions, however, a server system will not be able to determine what is meant by such a compressed value when it arrives (it will be ambiguous as between the two or more source strings that generate the compressed value). Thus, the system just discussed may also include provisions for resolving such collisions. For example, a computing device may perform a secondary compression that uses a different algorithm than the primary compression, so that if the values of both compressions do not match across different submissions, then the source text for those different submissions is known to be different. Alternatively, or in addition, a length of the source string may also be submitted so as to serve as yet another separate check on the source string.
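The collision checks described above can be illustrated with a deliberately weak primary hash, so that a collision is easy to produce. Everything here is an assumption for demonstration: `weak_primary` (a byte sum modulo 256) stands in for the primary compression only to force a collision, and CRC-32 stands in for the secondary algorithm.

```python
import zlib
from typing import Dict


def weak_primary(data: bytes) -> int:
    # Intentionally poor hash used only to make a collision easy to show.
    return sum(data) % 256


def build_report(text: str) -> Dict[str, int]:
    data = text.encode("utf-8")
    return {
        "primary": weak_primary(data),
        "secondary": zlib.crc32(data),  # a different algorithm than the primary
        "length": len(data),            # extra check on the source string
    }


def may_share_source(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """Two reports can come from the same string only if every check agrees."""
    return a == b


first, second = build_report("ab"), build_report("ba")
print(first["primary"] == second["primary"])  # True: the primary values collide
print(may_share_source(first, second))        # False: the secondary values differ
```

Agreement on all checks does not prove that two source strings are identical; it only makes an undetected collision much less likely, whereas any disagreement shows that the sources differ.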
- In particular implementations of such techniques, the collected data may be configuration data for the computing devices, which may include, for example, the make and model of the computer, the make and version of the operating system and the web browser that is being used, the identity of active plug ins and other applications currently executing on the computing device in addition to the browser, among other things, such as installed fonts, screen resolution, etc. Collected data may also include activity data that identifies actions that have been taken on the computer, including actions by third-party software that appears to be anomalous (e.g., attempts to interact with the revised web code in an invalid manner). Such data may be collected by one or more central server systems for diagnostics purposes, including for identifying the state of machines when a program throws an error, and for identifying common characteristics of computing devices that are exhibiting fraudulent or other anomalous behavior. For example, a criminal group may have a plug in or other software surreptitiously distributed to thousands of computers spread across the world to form a so-called bot net, and the server system discussed here may use reporting information from such computers to more quickly and accurately identify the presence of a new bot net that is emerging, and the behavior of that bot net (e.g., if common reports of malicious activity are coming from a particular operating system running a particular browser version).
- Various implementations are described herein using hardware, software, firmware, or a combination of such components. In some implementations, a computer-implemented method can include serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting parameters of the computing devices; receiving from different ones of the computing devices, a plaintext representation of a particular parameter of a first of the computing devices, and a hashed representation of the same parameter of a second of the computing devices; hashing the plaintext representation of the particular parameter to create a hash value, and comparing the hash value to the hashed representation; and based on a determination that the hash value matches the hashed representation, correlating the hashed representation to the plaintext representation on the computer server system, wherein the code for reporting parameters of the computing devices includes code for allowing the computing devices to determine whether to send a plaintext representation or a hashed representation.
- These and other implementations can optionally include one or more of the following features. The code for allowing the computing devices to determine whether to send a plaintext representation or a hashed representation can include biasing data that affects a frequency with which the computing devices select to send the plaintext representation or the hashed representation.
- The method can further include receiving from the computing devices, plaintext representations and hashed representations of a plurality of different parameters of the computing devices; hashing the received plaintext representations to create hashed values; and using correlations between the hashed values and the received plaintext representations to identify parameters represented by the hashed representations. The method can further include using the hashed representation and the plaintext representation to identify characteristics of malware executing on the computing devices.
- In some implementations, a computer-implemented method can include serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting status of the computing devices; receiving from one or more of the computing devices, first data that indicates a parameter of the one or more computing devices, the first data in a compressed format; receiving from one or more others of the computing devices, second data that indicates the parameter of the one or more others of the computing devices, the second data in an uncompressed format; and compressing the second data and comparing the compressed second data to the first data to correlate the first data to the second data, wherein the code for reporting status of the computing devices includes code for allowing the computing devices to determine whether to send the first data or the second data.
- These and other implementations can optionally include one or more of the following features. The code for allowing the computing devices to determine whether to send the first data or the second data can include biasing data that affects a frequency with which the computing devices select to send the first data or the second data. The first data can be compressed on the computing devices using hashing. The server system can be configured to not send hashing algorithm information to the computing devices. The method can further include using the compressed format to represent the parameter in identifying aggregate activity by multiple of the computing devices. The method can further include determining from the aggregate activity by multiple of the computing devices whether particular ones of the multiple computing devices are infected with malware. The computer server system can be an intermediary security server system that is separate from a web server system that generates and serves the web code. The method can further include comparing information sent with the compressed second data to information derived from the received first data to determine whether the compressed second data was generated from data that matches the first data.
- In some implementations, one or more non-transitory storage devices can store instructions that, when executed by one or more computer processors, perform operations comprising: serving, from a computer server system and to a plurality of different computing devices remote from the computer server system, web code and code for reporting status of the computing devices; receiving from one or more of the computing devices, first data that indicates a parameter of the one or more computing devices, the first data in a compressed format; receiving from one or more others of the computing devices, second data that indicates the parameter of the one or more others of the computing devices, the second data in an uncompressed format; and compressing the second data and comparing the compressed second data to the first data to correlate the first data to the second data, wherein the code for reporting status of the computing devices includes code for allowing the computing devices to determine whether to send the first data or the second data.
- These and other implementations can optionally include one or more of the following features. The code for allowing the computing devices to determine whether to send the first data or the second data can include biasing data that affects a frequency with which the computing devices select to send the first data or the second data. The first data can be compressed on the computing devices using hashing. The operations can further include using the compressed format to represent the parameter in identifying aggregate activity by multiple of the computing devices. The operations can further include determining from the aggregate activity by multiple of the computing devices whether particular ones of the multiple computing devices are infected with malware. The computer server system can include an intermediary security server system that is separate from a web server system that generates and serves the web code. The operations can further include comparing information sent with the compressed second data to information derived from the received first data to determine whether the compressed second data was generated from data that matches the first data.
- In some implementations, a computer-implemented system includes: a first data communication interface arranged to communicate with a web server system; a second data communication interface arranged to communicate with clients that request content from the web server system; and a compressed code interpreter programmed to identify an original form of compressed content received from particular ones of the clients by (a) compressing original content received from other ones of the clients to form a compressed representation, and (b) comparing the compressed representation to the compressed content received from the particular ones of the clients, wherein the compressed code interpreter compresses the original content using a technique that matches techniques used by the particular ones of the clients to compress the content.
- These and other implementations can optionally include one or more of the following features. The system can be further programmed to provide code to the clients that allows the clients to determine whether to provide compressed content or instead, uncompressed content to the system.
- The features discussed here may, in certain implementations, provide one or more advantages. For example, a security intermediary system may be provided that does not add an appreciable level of bandwidth to the communication channel between a server system and the clients it services. The intermediary system may collect data that is relatively large compared to the bandwidth that it occupies, and may use that data for diagnosing problems with particular clients, and across large numbers of clients (e.g., by identifying the spread of malware threats). Moreover, a wide variety of data for various purposes may be transmitted using these techniques, and may be used for a wide variety of purposes once it is interpreted at the server system. Moreover, in certain implementations, the compressed representations can be used as database keys, thus further simplifying the operations recited herein.
- Other features and advantages will be apparent from the description and drawings, and from the claims.
- The appended claims may serve as a summary of the invention.
- In the drawings:
- FIG. 1 is a schematic diagram of a system for providing compressed reporting of computing device information using a blind hash.
- FIG. 2 is a schematic diagram of a system for performing deflection and detection of malicious activity with respect to a web server system.
- FIG. 3 is a flow chart of a process for reducing bandwidth requirements between computers.
- FIG. 4 is a swim lane diagram of a process for transferring data between client computers and a server system.
- FIG. 5A is a representation of a state machine for client-side encoding.
- FIG. 5B is a representation of a state machine for server-side decoding.
- FIG. 6 is a block diagram of a generic computer system for implementing the processes and systems described herein.
- Like reference numbers and designations in the various drawings indicate like elements.
- In the following description, for the purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
- It will be further understood that: the term “or” may be inclusive or exclusive unless expressly stated otherwise; the term “set” may comprise zero, one, or two or more elements; the terms “first”, “second”, “certain”, and “particular” are used as naming conventions to distinguish elements from each other and do not imply an ordering, timing, or any other characteristic of the referenced items unless otherwise specified; the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items; and the terms “comprises” and/or “comprising” specify the presence of stated features, but do not preclude the presence or addition of one or more other features.
- This document discusses mechanisms for reducing bandwidth between client computing devices and server systems with which they communicate (where “clients” and “servers” are terms used generally, and do not require any sort of formal client-server architecture). Generally, the mechanisms are most useful where many different computing devices will be communicating the same data to the server system. For example, it may be beneficial to have computing devices report their configuration information to a server system so that the system can identify commonality in the operations of such devices, for example, to diagnose reasons for faults in the devices or to identify the emergence of malware on the devices in a large group of devices (e.g., all devices that access a banking or retail web site).
- The common data that is communicated may be communicated by some of the computing devices in its native form (e.g., plaintext) or another form in which its content can be directly determined (e.g., via lossless compression or encryption for which the server system receiving the data can accurately decompress or decrypt the data).
- Others of the devices may communicate the same data in a form from which it cannot be identified directly, such as by submitting a hash of the data. When the server system receives compressed representations of the data but has not yet received the original representation, it can save indications of the compressed representations in association with the computing devices from which they were received, without knowing the original representation. When the server system receives any uncompressed representations, it can compress them using the same algorithm that the client devices used, can store the correlation of the compressed representation to the original representation, and can use that correlation to resolve any compressed representations, whether associated with events reported from computing devices in the past or the future, to determine what the compressed representation actually represents.
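- The store-then-resolve behavior described above can be illustrated with a minimal sketch. The class name `BlindHashTable`, its method names, and the use of a truncated SHA-256 digest as the shared compression function are assumptions made for illustration only; the description does not prescribe any of them.

```python
import hashlib


def compress(text):
    # Hypothetical stand-in for the hash shared by clients and server:
    # the first 8 hex digits of SHA-256 over the UTF-8 bytes.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


class BlindHashTable:
    """Stores hashes before their plaintext is known and resolves them later."""

    def __init__(self):
        self.plaintext_by_hash = {}  # hash -> original text, once learned
        self.devices_by_hash = {}    # hash -> ids of devices that reported it

    def receive_hash(self, device_id, hashed):
        # Record the report even though the original text may be unknown.
        self.devices_by_hash.setdefault(hashed, set()).add(device_id)

    def receive_plaintext(self, device_id, text):
        # Compress with the same function the clients use, then store the
        # correlation so earlier and later hashed reports can be resolved.
        hashed = compress(text)
        self.plaintext_by_hash[hashed] = text
        self.devices_by_hash.setdefault(hashed, set()).add(device_id)
        return hashed

    def resolve(self, hashed):
        # Returns None while the hash is still "blind".
        return self.plaintext_by_hash.get(hashed)
```

A hashed report that arrives first resolves to `None`; once any device sends the plaintext, the same key resolves for every device that reported it, past or future.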
- Some or all compressed representations may be accompanied by a secondary representation that can be used to identify potential collisions between the compressed representations. In particular, because the compressed representations are smaller in size than the uncompressed representations, certain compressed representations will end up being repeated in a system, so that two identical compressed representations received by a server system could represent different original strings.
- Though proper selection of parameters will make such collisions relatively rare, where the volume of the different strings that need to be represented is extensive, the risk of a collision may be relevant. The secondary representation, then, may serve as a check on the main representation, as it will be extremely unlikely that both would match even though the original text did not. Such secondary representation may be transmitted to the server system with the compressed representation, and may be formed, for example, by applying a second hash or other compression technique to the original text that uses a different algorithm, or by sending a value that represents a length of the original string.
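- As a small illustration of the length-based check, the sketch below uses a deliberately weak hash so that a collision is easy to produce. The function names and the hash itself are hypothetical and chosen only to make the effect visible; a real deployment would use a far stronger hash.

```python
def tiny_hash(text):
    # Deliberately weak, hypothetical hash: sum of code points modulo 64.
    return sum(ord(ch) for ch in text) % 64


def encode(text):
    # Main representation (hash) plus secondary representation (length).
    return tiny_hash(text), len(text)


def matches(report, known_text):
    # A report is accepted for known_text only if both parts agree.
    return report == encode(known_text)
```

Under this hash the strings "a" and "01" collide, but their lengths differ, so a report generated from "01" is not mistaken for "a". Two colliding strings of equal length would still slip through, which is why a second hash using a different algorithm is mentioned as an alternative.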
- The compressed representations or other representations that correspond to the compressed representations may then be passed as identifiers for the original data to systems that can perform analysis using such data. For example, client devices may pass reports that indicate anomalous activity, such as efforts by a browser plug-in to access served code using defunct function names or the like (e.g., in a system that uses a security intermediary to change the function names with each serving of the web code).
- A fraud detection system may perform clustering analysis on the reported features of such computing devices, and may use the compressed representations as identifiers for the various reported features in performing such analysis. The analysis may be used to identify that devices having particular characteristics (e.g., IP address, operating system, and browser) have reported the existence of anomalous behavior, which may in turn be used to determine whether the anomalous behavior is benign (e.g., from a plug in that users intentionally installed) or malicious (e.g., code performing a “Man in the Middle” attack on their devices).
-
FIG. 1 is a schematic diagram of a system 100 for providing compressed reporting of computing device information using a blind hash. In general, the system 100 is directed to presenting information from a web server system 108 to a variety of computing devices 114A-C that are located remotely from the web server system 108.
- Examples of operators of such a
web server system 108 include on-line retailers and on-line banking systems, where the devices 114A-C belong to people trying to buy products or perform on-line banking transactions. The web server system 108 is shown as a row of servers along with a separate row of servers for a security server system 106, both in a single data center facility. Such an arrangement is intended to indicate that, in one typical implementation, an operator of a web site may supplement its main server system 108 with a security server system 106 that it builds itself or that it acquires from a third party.
- The
security server system 106 may sit physically and logically between the web server system 108 and the network, which may include internet 104, and may intercept web code to be served to the various client devices 102A-C.
- In the described example, the
system 100 operates by providing modified or recoded web code to the client computing device 102, where the modifications are relative to a web page that would normally be served to the client computing device without additional security measures applied. Web code may include, for example, HTML, CSS, JavaScript, and other program code associated with the content or transmission of web resources such as a web page that may be presented at a client computing device 102 (e.g., via a web browser or a native application (non-browser)).
- The
system 100 can detect and obstruct attempts by fraudsters and computer hackers to learn the structure of a website (e.g., the operational design of the pages for a site) and exploit security vulnerabilities in the client device 102. For example, malware may infect the client device 102 and gather sensitive information about a user of the device, or deceive a user into engaging in compromising activity such as divulging confidential information. Man-in-the-middle exploits are performed by one type of malware that is difficult to detect on a client device 102, but can use security vulnerabilities at the client device 102 to engage in such malicious activity.
- Served
code 110 shows an example of code that can be served to a requesting one of the computing devices 102A-C after the request is provided to the web server system 108, the content from the web server system 108 is intercepted or otherwise provided to the security server system 106, and the code is changed and/or supplemented by the security server system 106. Various portions of the served code 110 are shown schematically to indicate actions that the security server system 106 can take with respect to the code.
-
Code 110A represents the original web code provided by the web server system 108 with certain modifications made to it. For example, the security server system 106 may change the names of functions in essentially random ways every time a set of content for a web page is served, where the changes are made consistently across the served code so as not to break internal references between pieces of the code. In particular, references to a particular function may be made consistently across HTML, CSS, and JavaScript. The following strings show HTML before and after alteration using a random number for textual replacement:
- Original code:
-
<form action="login.jsp" method="post" name="Login">
<input type="text" id="lastname_id" name="lastname"
- Re-coded format:
<form action="login.jsp" method="post" name="imp0q6wNm">
<input type="text" id="b24mpqdfKX" name="aSkFjp5x1Y"
- Such changes may be made so that malware on a client device that receives the code cannot easily identify the operational structure of the web site and/or automatically interact with the code so as to mislead a user into opening its security to the malware (e.g., for a Man in the Middle attack). By making the changes frequently enough and randomly enough that automated malware cannot interact with the code predictably, the
security server system 106 interferes with such attacks by malware.
-
Instrumentation code 110B is added to the code 110A by the security server system 106, and allows the system 100 to detect malware in addition to deflecting its efforts. In particular, the instrumentation code 110B can execute in the background on the computing devices 102A-C and can monitor how the code 110A operates and how other code on the particular computing device 102A-C interacts with the execution of code 110A. For example, the instrumentation code 110B can monitor the DOM made from the code 110A at different points in time and may report back to security server system 106 information that characterizes the current state of the DOM. Such information can be compared to information that indicates what the DOM should look like in order to determine whether other code is interfering with the execution of code 110A. Alternatively, or in addition, the instrumentation code can identify anomalous attempts by third-party code to interact with the operation of code 110A, such as calls made to code 110A using “old” names for the code (e.g., names that were valid in a prior serving of the relevant web page but that are no longer relevant because security server system 106 is constantly changing the names so as to create a moving target for such third-party code to hit).
- A
user telemetry script 110C is also provided to a requesting one of computing devices 102A-C. The user telemetry script 110C may include code for managing communications between the relevant client device and the security server system 106. Such communications may include transmission of information identified by the instrumentation code 110B described above, and other relevant information. In certain implementations, the security server system 106 can be supplied additional information using the user telemetry script after the code 110A has been served, such as information that affects the manner in which the instrumentation code 110B operates. For example, the security server system 106 may receive a report from the user telemetry script 110C that indicates that a third-party program is attempting to interact with the served code 110A, and may respond so as to have the instrumentation code 110B perform certain operations to better understand the nature of the interaction occurring on the computing device.
- A
request frequency code 110D may also be sent and may be as simple as a single number that biases the user telemetry script 110C to return information to the security server system 106 in its original form, or instead in a compressed form. For example, the request frequency code 110D that is sent in this example is a value of 1000, which may have been selected by the security server system 106 from a range between 0 and 1024. In turn, the user telemetry script 110C may be programmed to select a random number between 0 and 1024, and to return the original text rather than a compressed version of the original text when the randomly-selected number exceeds 1000. As a result, original text will be returned by only about 2% of all computing devices that are served code from the security server system 106 using this request frequency value. Others of the computing devices will return a compressed version of the text, such as a hash of the original text produced by the particular device.
- Upon receiving the
code 110, the particular client devices 102A-C may render respective webpages and establish document object models that represent the served page, in a familiar manner. User interactions with the webpage and associated code may then begin. At or around that time, the instrumentation code 110B and user telemetry script 110C may execute to return information about the configuration of a particular computing device to the security server system 106. For example, the user telemetry script 110C may return data that identifies the operating system of the particular computing device, the model of the particular computing device, the amount of RAM loaded on the computing device, other applications executing on the computing device, and similar information. In certain implementations, such functionality may be provided using a browser plug in that is programmed to perform a check of the environment for the machine on which it is running. Generally, JavaScript or VBScript can permit the measurement of User Agent, other HTTP header information, indirect measurements of the JavaScript execution environment, Plugin information, fonts, and screen information.
- As shown by the arrow labeled with a 1 in a circle,
computing device 102A returns the numeric pair 24.16. These numbers represent, respectively, a hash of a textual string that represents the name and model of the browser that is running on computing device 102A, and the length of that string. In the example here, all three computing devices 102A-C are running the “Chrome 2.3.21.04” browser release. Such information may be obtained by making a request that is to be responded to with the “user agent” string on the particular computing device, in a familiar manner. In the current example, computing device 102A delivered this compressed representation of the user agent string, because it generated a random number of 300, which is less than the request frequency number of 1000.
- Similarly, when computing
device 102B received the served code 110, it generated a random number of 674, meaning that it too would send a compressed version of the user agent string, or 24.16. In both these examples, 24 has been selected as an example to represent a hash that may be created from such a string, and the number 16 represents the number of characters in that string.
- The actual string itself can be seen as being transmitted from
computing device 102C back to security server system 106. Here, computing device 102C selected a random number of 1012, which is greater than the request frequency number of 1000. As a result, computing device 102C will be one of the 2% of all devices that report back the original, uncompressed (unhashed) version of the user agent string.
- To better show the level to which an initial string can be compressed, the user agent string for Firefox on an iPad is “Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405.” A compressed representation that indicates a hash and a length might be of the form 4528.111. As can be appreciated, the bandwidth for the latter is much lower than for the former.
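- The client-side choice between plaintext and a hash.length pair can be sketched as follows. This is an illustration under stated assumptions, not the served telemetry script: the function names and the report layout are hypothetical, the draw is assumed to be a uniform integer from 0 to 1024 inclusive, and the hash follows the publicly documented Java `String.hashCode` formula, which yields much larger values than the illustrative 24 used in the figure.

```python
import random


def java_string_hash(text):
    # Java String.hashCode formula: h = 31*h + code unit, wrapped to a
    # signed 32-bit integer. ord() equals the UTF-16 code unit only for
    # characters in the Basic Multilingual Plane.
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def build_report(value, threshold, rng, max_draw=1024):
    # Draw a number in [0, max_draw]; send plaintext only when the draw
    # exceeds the served request frequency value (the threshold).
    draw = rng.randint(0, max_draw)
    if draw > threshold:
        return {"kind": "plaintext", "value": value}
    # Otherwise send the compressed pair "hash.length".
    return {"kind": "hashed", "value": f"{java_string_hash(value)}.{len(value)}"}


if __name__ == "__main__":
    report = build_report("Chrome 2.3.21.04", threshold=1000, rng=random.Random())
    print(report)
```

With a threshold of 1000 and draws from 0 to 1024 inclusive, 24 of the 1025 possible draws select plaintext, which is roughly 2.3% and consistent with the "about 2%" figure in the text.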
- In the figure, operations of the
security server system 106 performed in response to receiving the communications from devices 102A-C are shown schematically as a two-column database entry below security server system 106 and Web server system 108. The two columns are shown to indicate how a system may associate a compressed version of a string with the actual string itself. In a first representation shown by a 1 in a circle and corresponding to actions that would occur in response to the first transmission from computing device 102A, the database has been populated with the hash value of 24 upon receiving that hash value from device 102A. The system 100 does not, at that point, know what the original string representation for that value is (assume that the system did not receive earlier communications regarding the user agent string from other devices), but stores the hash value 24 in anticipation that it will eventually be able to determine what the original, plaintext value is.
- In a second representation shown by the
number 2 in a circle and representing the transmission from computing device 102B, the table has not changed because again, the security server system 106 received only the hash code, and not the original version of the user agent string. Finally, at the bottom of the representation, the system receives a string of original plaintext and, as shown by the arrow labeled with “hash,” the system performs a hashing function on that plaintext that is the same as the hashing function that the system 100 knows to be used by the computing devices 102A-C. For example, each of the computing devices 102A-C and the security server system 106 may be programmed to use the same hashing algorithm, such as the Java hash algorithm, which is well known and readily available on many computing platforms.
- With that hash value (24) in hand, the
security server system 106 may search the table for a matching value, and when it finds such a matching value, it may determine that the matching hash value corresponds to the original text. It may then update the table to correlate the particular hash value with the particular original plaintext. Such a correlation is shown in the row of the table labeled with a 3 in a circle.
- This correlation may then be used with other parts of the system. For example, the
number 24 can be used throughout the system to represent the user agent string represented here (i.e., as a unique database index value). As some examples, a cluster analysis system like that discussed with respect to FIG. 2 below may use the number 24 to represent such a feature instead of using the full string representation. In other embodiments, yet a third representation for the feature may be used as an index representation.
- The processing of the communication from
computing device 102C may also be accompanied by a determination that the full string is 16 characters in length. Such a value may be stored in yet a third column of the table (not shown) and may be correlated to the hash value and the original plaintext of the string. When later communications arrive with a hash value of 24, they may be compared to the first column shown in the table, and their accompanying value of 16 may be compared to this additional value to provide more confidence that the hash value is unique to this particular original textual string. As discussed above, other techniques may also be used to ensure that there are no collisions in the hash values, such as by returning an additional number or other representation that is generated by an alternative hash algorithm. In certain implementations, if the security server system 106 identifies that there may be a problem with a received hash value, the security server system 106 may provide a special message to the responding computing device to trigger the responding computing device to transmit the original plaintext code instead of the hash value.
- Other tables may store additional relationships that are of value in operating the
system 100. For example, one table may store identifiers for particular ones of the computing devices 102A-C, where a particular device may be identified by a cookie that it stores and passes to the security server system 106. That device identifier may then be related to the variety of parameters, such as the user agent parameter just discussed, and additional parameters, which may include hardware identifiers, operating system identifiers, and software identifiers, among other things. By this mechanism then, the system 100 may correlate a particular device to particular configuration information and to configuration parameters reported by the device.
- This particular example is highly simplified for purposes of clarity. In a typical implementation, many different webpages and other Web resources will be served by
system 100 to many different computing devices. Thus, a large number of different hash values will be received by security server system 106 in an interleaved fashion with each other, and the system will need to correlate those hash values or other compressed values with particular original text represented by those values. Such multi-value implementation may occur, for example, by adding additional records to the simple table shown here, or by other appropriate techniques.
- A server system can also specify a seed to be used before generating a random number, or specify another random number generation method (and the initial state of the pseudorandom number generator (PRNG)), and the choice threshold value, such that the sequence of fields chosen will be known by the server. This can be used to force the client to generate an “uncompressed” value for a field that is unknown by the client. It can also be used to allow the server to have more control over the data flow (more or less data), and can even be used as a mechanism for determining when a malicious client is sending data in a non-compliant format, which could be used to determine that the client is, in fact, infected with malware.
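- One way to picture the seeded approach is the sketch below, in which the server replays the client's pseudorandom choices and flags reports that deviate from them. The function names are hypothetical, and Python's `random.Random` stands in for whatever PRNG the client and server would actually share; the technique only works if both sides run an identical generator from the same seed.

```python
import random


def choices_for_seed(seed, threshold, field_count, max_draw=1024):
    # Replay the per-field draws; True means "send this field as plaintext".
    rng = random.Random(seed)
    return [rng.randint(0, max_draw) > threshold for _ in range(field_count)]


def is_compliant(seed, threshold, reported_kinds):
    # reported_kinds lists, per field, either "plaintext" or "hashed".
    expected = choices_for_seed(seed, threshold, len(reported_kinds))
    observed = [kind == "plaintext" for kind in reported_kinds]
    return expected == observed
```

Because the server picks both the seed and the threshold, it can choose values that make a given field come back uncompressed, and a client whose report pattern differs from the replayed sequence can be flagged for closer inspection.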
- From time to time, hash values that have already been correlated with original text may also be tested by other incoming original text. For example, the
security server system 106 might not normally perform a hash on incoming original text if the system 106 has determined that there is already a correlation for that text in the table.
- However, a random number approach similar to that used on the
computing devices 102A-C may be used so that the security server system 106 periodically does perform such a hashing and comparison so as to confirm the accuracy of the data in the table. If the system 106 determines that there is an inaccuracy, because the hash value generated for an incoming string of text does not match a pre-existing hash value in the system for that text, the system 106 may generate an exception and alert an operator of the system 106.
- Also, the example here is stated in terms of Web code being served to a general web browser. Other types of code may alternatively be served to other types of applications. In such situations, those other applications may be caused to choose whether to return original or compressed configuration information, or such decisions may be made by code separate from the applications but made in respect to the execution of the applications.
- Also, although the techniques discussed here have been associated with communications for the delivery of information related to browser environment and user or automated interactions with web pages, they may also, in appropriate circumstances, be applied more generally. For example, other data that is reported at periodic intervals and is common as between a substantial portion of those reporting events, may be compressed using the techniques here, and interpreted using uncompressed (or losslessly compressed) messages in some instances of the reporting, and lossy compressed messages corresponding to the same content in other instances of the reporting. Various mechanisms, including those discussed above and below, may be used to identify that the compressed and uncompressed messages match each other in their content, and to then associate the compressed messages with the uncompressed content.
- In addition, while the techniques are described here as involving transmission of data to a server system from web code served to a browser, other techniques may also be used. For example, a stand-alone application for a particular organization may report information to a server system, and may be programmed to use the sometimes-compressed/sometimes-uncompressed techniques described here to transmit necessary data to the server system (particularly when the data is largely repetitive as between different reporting events for the data).
-
FIG. 2 is a schematic diagram of a system 200 for performing deflection and detection of malicious activity with respect to a web server system. The system 200 may be the same as the system 100 discussed with respect to FIG. 1, and is shown in this example to better explain the interrelationship of various general features of the overall system 200, including the use of the reporting of compressed and uncompressed versions of the same strings in order to conserve bandwidth (for compressed representations) and to determine what the compressed representations represent (for uncompressed representations). - The
system 200 in this example is a system that is operated by or for a large number of different businesses that serve web pages and other content over the internet, such as banks and retailers that have on-line presences (e.g., on-line stores, or on-line account management tools). The main server systems operated by those organizations or their agents are designated as web servers 204 a-204 n, and could include a broad array of web servers, content servers, database servers, financial servers, load balancers, and other necessary components (either as physical or virtual servers). - A set of
security server systems 202 a to 202 n are shown connected between the web servers 204 a to 204 n and a network 210 such as the internet. Although both extend to n in number, the actual number of sub-systems could vary. For example, certain of the customers could install two separate security server systems to serve all of their web server systems (which could be one or more), such as for redundancy purposes. The particular security server systems 202 a-202 n may be matched to particular ones of the web server systems 204 a-204 n, or they may be at separate sites, and all of the web servers for various different customers may be provided with services by a single common set of security servers 202 a-202 n (e.g., when all of the server systems are at a single co-location facility so that bandwidth issues are minimized). - Each of the security server systems 202 a-202 n may be arranged and programmed to carry out operations like those discussed above and below and other operations. For example, a
policy engine 220 in each such security server system may evaluate HTTP requests from client computers (e.g., desktop, laptop, tablet, and smartphone computers) based on header and network information, and can set and store session information related to a relevant policy. The policy engine 220 may be programmed to classify requests and correlate them to particular actions to be taken to code returned by the web server systems (for transmission to requesting clients) before such code is served back to a client computer. When such code returns, the policy information may be provided to a decode, analysis, and re-encode module 224, which matches the content to be delivered, across multiple content types (e.g., HTML, JavaScript, and CSS), to actions to be taken on the content (e.g., using XPATH within a DOM), such as substitutions, addition of content, and other actions that may be provided as extensions to the system. For example, the different types of content may be analyzed to determine naming that may extend across such different pieces of content (e.g., the name of a function or parameter), and such names may be changed in a way that differs each time the content is served, e.g., by replacing a named item with randomly-generated characters. Elements within the different types of content may also first be grouped as having a common effect on the operation of the code (e.g., if one element makes a call to another), and then may be re-encoded together in a common manner so that their interoperation with each other will be consistent even after the re-encoding. - A
rules engine 222 may store analytical rules for performing such analysis and for re-encoding of the content. The rules engine 222 may be populated with rules developed through operator observation of particular content types, such as by operators of a system studying typical web pages that call JavaScript content and recognizing that a particular method is frequently used in a particular manner. Such observation may result in the rules engine 222 being programmed to identify the method and calls to the method so that they can all be grouped and re-encoded in a consistent and coordinated manner. - The decode, analysis, and
re-encode module 224 encodes content being passed to client computers from a web server according to relevant policies and rules. The module 224 also reverse encodes requests from the client computers to the relevant web server or servers. For example, a web page may be served with a particular parameter, and may refer to JavaScript that references that same parameter. The decode, analysis, and re-encode module 224 may replace the name of that parameter, in each of the different types of content, with a randomly generated name, and each time the web page is served (or at least in varying sessions), the generated name may be different. When the name of the parameter is passed back to the web server, it may be re-encoded back to its original name so that this portion of the security process may occur seamlessly for the web server. - A key for the function that encodes and decodes such strings can be maintained by the security server system 202 along with an identifier for the particular client computer so that the system 202 may know which key or function to apply, and may otherwise maintain a state for the client computer and its session. A stateless approach may also be employed, whereby the system 202 encrypts the state and stores it in a cookie that is saved at the relevant client computer. The client computer may then pass that cookie data back when it passes the information that needs to be decoded back to its original status. With the cookie data, the system 202 may use a private key to decrypt the state information and use that state information in real-time to decode the information from the client computer. Such a stateless implementation may create benefits such as less management overhead for the server system 202 (e.g., for tracking state, for storing state, and for performing clean-up of stored state information as sessions time out or otherwise end) and as a result, higher overall throughput.
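The coordinated renaming and reverse encoding performed by the decode, analysis, and re-encode module 224 can be sketched in Python as follows. This is a simplified illustration of the stateful variant only: the function names, the `p_` alias prefix, and the whole-word regular-expression matching are assumptions, and the encrypted-cookie (stateless) variant is not shown.

```python
import re
import secrets


def reencode(contents, names):
    """Replace each name with a fresh random alias across all content pieces.

    contents maps a label (e.g. "html", "js") to source text. Returns the
    rewritten contents plus the alias -> original mapping, which the
    server would keep as session state.
    """
    if not names:
        return dict(contents), {}
    alias_for = {name: "p_" + secrets.token_hex(6) for name in names}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    rewritten = {
        label: pattern.sub(lambda m: alias_for[m.group(1)], text)
        for label, text in contents.items()
    }
    reverse = {alias: name for name, alias in alias_for.items()}
    return rewritten, reverse


def decode_request(params, reverse):
    """Restore original parameter names on a request from the client."""
    return {reverse.get(key, key): value for key, value in params.items()}
```

Because the same alias is applied to every content type in one pass, the HTML and the JavaScript that references it stay consistent, and the web server only ever sees the original name after `decode_request` runs.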
- An
instrumentation module 226 is programmed to add instrumentation code to the content that is served from a web server. The instrumentation code is code that is programmed to monitor the operation of other code that is served. For example, the instrumentation code may be programmed to identify when certain methods are called, when those methods have been identified as likely to be called by malicious software. When such actions are observed to occur by the instrumentation code, the instrumentation code may be programmed to send a communication to the security server reporting on the type of action that occurred and other meta data that is helpful in characterizing the activity. Such information can be used to help determine whether the action was malicious or benign. - The instrumentation code may also analyze the DOM on a client computer in predetermined manners that are likely to identify the presence of and operation of malicious software, and to report to the security servers 202 or a related system. For example, the instrumentation code may be programmed to characterize a portion of the DOM when a user takes a particular action, such as clicking on a particular on-page button, so as to identify a change in the DOM before and after the click (where the click is expected to cause a particular change to the DOM if there is benign code operating with respect to the click, as opposed to malicious code operating with respect to the click).
- Data that characterizes the DOM may also be hashed, either at the client computer or the server system 202, to produce a representation of the DOM (e.g., in the differences between part of the DOM before and after a defined action occurs) that is easy to compare against corresponding representations of DOMs from other client computers.
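One way to picture the hashed before-and-after DOM representation described above is the following Python sketch. The nested-dictionary snapshot format, the path scheme, and the use of SHA-256 are assumptions for illustration; real instrumentation code would run in the browser and is not specified here.

```python
import hashlib
import json


def flatten(node, path="", out=None):
    """Flatten a nested-dict DOM snapshot into {path: JSON of sorted attrs}."""
    if out is None:
        out = {}
    here = f"{path}/{node['tag']}"
    out[here] = json.dumps(sorted(node.get("attrs", {}).items()))
    for index, child in enumerate(node.get("children", [])):
        flatten(child, f"{here}[{index}]", out)
    return out


def dom_change_digest(before, after):
    """Hash the difference between two DOM snapshots.

    Clients whose DOM changed in the same way around an action (e.g. a
    click) produce the same digest, so digests can be compared directly.
    """
    b, a = flatten(before), flatten(after)
    changes = {
        "added": {p: a[p] for p in a.keys() - b.keys()},
        "removed": sorted(b.keys() - a.keys()),
        "changed": {p: a[p] for p in a.keys() & b.keys() if a[p] != b[p]},
    }
    canonical = json.dumps(changes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A digest that differs from the one most clients report for the same action would indicate that something else altered the DOM around that action, which is the kind of deviation the instrumentation code is meant to surface.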
- Other techniques may also be used by the instrumentation code to generate a compact representation of the DOM or other structure expected to be affected by malicious code in an identifiable manner.
- The
instrumentation module 226 or another component may also provide a user telemetry script or other code for causing the client device receiving the other code to communicate with the server system after the code is transmitted. Such additional code may include code that causes the client devices to return configuration information about themselves, and to control whether they return the information in a compressed or native state, in the manners described above. The module 226 may also generate and provide to the client devices a request frequency value that helps control how often the native text is transmitted back to the system instead of the compressed form of the text. One or more modules may also control the receipt of such configuration information, the storage of the information, and the correlation of the compressed data (e.g., being used as an index value for a table) and the corresponding original form of the data. - As noted, the content from web servers 204 a-204 n, as encoded by decode, analysis, and
re-encode module 224, may be rendered on web browsers of various client computers. Uninfected client computers 212 a-212 n represent computers that do not have malicious code programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. Infected client computers 214 a-214 n represent computers that do have malware, or malicious code (218 a-218 n, respectively), programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. In certain implementations, the client computers 212, 214 may also store the encrypted cookies discussed above and pass such cookies back through the network 210. The client computers 212, 214 will, once they obtain the served content, implement DOMs for managing the displayed web pages, and instrumentation code may monitor the respective DOMs as discussed above. Reports of illogical activity (e.g., software on the client device calling a method that does not exist in the downloaded and rendered content) can then be reported back to the server system. - The reports from the instrumentation code may be analyzed and processed in various manners in order to determine how to respond to particular abnormal events, and to track down malicious code via analysis of multiple different similar interactions across different client computers 212, 214. For small-scale analysis, each web site operator may be provided with a
single security console 207 that provides analytical tools for a single site or group of sites. For example, the console 207 may include software for showing groups of abnormal activities, or reports that indicate the type of code served by the web site that generates the most abnormal activity. For example, a security officer for a bank may determine that defensive actions are needed if most of the reported abnormal activity for its web site relates to content elements corresponding to money transfer operations, an indication that stale malicious code may be trying to access such elements surreptitiously. - A
central security console 208 may connect to a large number of web content providers, and may be run, for example, by an organization that provides the software for operating the security server systems 202 a-202 n. Such console 208 may access complex analytical and data analysis tools, such as tools that identify clustering of abnormal activities across thousands of client computers and sessions, so that an operator of the console 208 can focus on those clusters in order to diagnose them as malicious or benign, and then take steps to thwart any malicious activity. - In certain other implementations, the
console 208 may have access to software for analyzing telemetry data received from a very large number of client computers that execute instrumentation code provided by the system 200. Such data may result from forms being re-written across a large number of web pages and web sites to include content that collects system information such as browser version, installed plug-ins, screen resolution, window size and position, operating system, network information, and the like. In addition, user interaction with served content may be characterized by such code, such as the speed with which a user interacts with a page, the path of a pointer over the page, and the like. The telemetry data may also include the received data that characterizes the then-current conditions of each of the client devices, such as the browser and operating systems that they were running, and other appropriate information. - Such collected telemetry data, across many thousands of sessions and client devices, may be used by the
console 208 to identify what is “natural” interaction with a particular page that is likely the result of legitimate human actions, and what is “unnatural” interaction that is likely the result of a bot interacting with the content. - Statistical and machine learning methods may be used to identify patterns in such telemetry data, and to resolve bot candidates to particular client computers. Such client computers may then be handled in special manners by the
system 200, may be blocked from interaction, or may have their operators notified that their computer is potentially running malicious software (e.g., by sending an e-mail to an account holder of a computer so that the malicious software cannot intercept it easily). -
FIG. 3 is a flow chart of a process for reducing bandwidth requirements between computers. In general, the process involves providing client computers with code that causes the computers to report back aspects of their operation. Different ones of the client computers are caused to report the information in compressed form, while others of the client devices are caused to report the same information in an original uncompressed, or plaintext form. The process can then use the combination of compressed and uncompressed reported information to correlate the compressed representations with the uncompressed representations, even though no particular computer or transmission provided such a correlation for the server system that served the code. The server system may make the correlation, for example, by performing a compression of received uncompressed code in a manner that matches the way that one or more of the client devices performed the compression of the same code or data. - The process begins at
box 302, where the server system serves Web code to a plurality of different client devices. The Web code may be code for a particular webpage, for multiple related webpages, or for various unrelated webpages associated with different websites, including websites from different domains. In certain implementations, the Web code may be recoded from what is initially served by a Web server, such as by rewriting the names of particular functions or other elements in unpredictable manners but in a way that is consistent across all of the elements being served (e.g., so that the code does not break when executed and so that calls made to a particular function or other element are changed according to the changes made in the name of the element). - At
box 304, supplemental code is served by the system. The supplemental code may be served along with the Web code in a single transaction, or may be served separately. The supplemental code may include, for example, instrumentation code and telemetry code that causes the receiving client device to monitor the operation of the Web code that is served to the device and potentially to report back on such operation to a security server system, if the monitoring determines that anomalous activity is occurring on the client device. Other code may also be served, such as parameter values that may affect the way in which the supplemental code operates, such as a request frequency number described above, and other appropriate values. - At
box 306, the server system may have waited after serving both the Web code and the supplemental code, and may subsequently receive, from the client or clients to whom the code was served, hashed representations for configuration. Those representations may represent a variety of parameters that are relevant to the client devices from which they come, including identifiers for the current configuration state of a particular client device. The particular parameter may be identified, and the value of the identified parameter may be identified by the hash code that one of the client devices generated by hashing the plaintext parameter value. A number of different parameters may be reported on for each client device, and even more parameters may be reported on across a universe of client devices. For example, Web code served from a certain webpage may be accompanied by instrumentation code that reports back on certain parameters of a device, while Web code served for another webpage may be accompanied by code that reports back on other parameters. - When the system receives such hashed representations, it may save them, as shown at
box 308, even though it does not at that time know what original values they represent. Such representations may also be associated with identifiers for client devices from which they were received, so that the particular configuration information for those devices may be determined later, even if it cannot be determined when the hashed representations are initially received. - At
box 310, plaintext representations are received from one or more client devices. The plaintext representations may have been transmitted by those client devices in response to the client devices executing instrumentation or telemetry code that instructed the transmission of such plaintext versions of the information to be transmitted (e.g., upon the client device choosing to transmit plaintext rather than a compressed representation). When the security system receives plaintext representations from telemetry code, it may be programmed to first compress those plaintext representations such as by hashing them. The compression may occur according to a mechanism that matches a known hashing mechanism to be operating on the client devices in cooperation with the instrumentation and telemetry code that was served to those client devices. - With the plaintext representations having been hashed, the security system will now have a correlation between a particular plaintext representation and a particular hash value. The system may then compare that hash value to any of the hashed values that have previously been received, at
box 314, and may then correlate whatever previously-received hash values were received to the plaintext representation that was later received, at box 316. In certain examples, the initial transfer of a particular piece of data may be in plaintext form, so that the database would be populated with a plaintext representation and a hash representation simultaneously. Later transmissions of plaintext representations may simply be matched against the plaintext column of the database, and the devices that sent those plaintext representations may be correlated with the hashed value as an index value for those devices. Alternatively, the plaintext values that are later received may always be hashed, and the hashed values may be compared against the database if that is a more efficient operation of the system computationally. Also, periodically, plaintext representations and their hash values may be checked against the table to ensure that there are no errors in the data. In addition, other values that represent the plaintext may be transmitted along with the hashed representations of the plaintext so as to ensure that the system is not receiving overlapping hash values that match each other but that each represent different plaintext representations. - At
box 318, characteristics of infected computers are identified using information gleaned from the previous steps. For example, the hashed values may be used as data in statistical analysis techniques, such as techniques that may attempt to identify clusters of activity within a population of computers, such as a population of hundreds of thousands of computers. Clustering may indicate anomalous activities by those computers, and the hash values may then be used to determine what configuration information is possessed in common by computers within that cluster. As one example, the analysis may determine that a large majority of computers having anomalous behavior are running a recently released operating system or browser version (i.e., that anomalous behavior is clustered around a dimension associated with that particular value of the user agent parameter for a population of machines). Such a determination may be evidence of a vulnerability of such browser or operating system version to malware. An operator of the system described here may then act upon such information, such as to cause the browser or operating system to be updated or the security hole to otherwise be plugged. -
FIG. 4 is a swim lane diagram of a process for transferring data between client computers and a server system. In general, the process, like those discussed above, involves transmitting content to a server system, in most instances, in a compressed manner from which the identity of the original content cannot be determined (a lossy compression like forming a hash). In a small number of cases, the content can be transmitted in an uncompressed or losslessly compressed form, the received data may be compressed using a process equivalent to the process that was used by clients on the other received content, and the compressed form may be matched to the compressed forms received in that other received content. In this way, the original form of the other received content (both past and future) can be inferred. - The process begins at
box 402, where a client device requests a web page, such as via a GET or POST method. Such a request may be directed to a particular URL served by a web server system of a particular organization. The request may result in the web server system identifying appropriate code to respond to the request, which may include static code and dynamic code, and may take the form of HTML, CSS, and JavaScript, among others. At box 404, the web server system serves the responsive code. - The served code is intercepted at
box 406 by a security server system that, e.g., the operator of the web server system has added as an intermediary for providing security for the web server system. For example, a third party may provide a security system that can be added modularly to a company's web server system without having to affect the web server system in any substantial manner. In other implementations, the intermediary functionality may be integrated in the web server system. Also, the intermediary server system may be physically located within the same building as the web server system (for minimizing latency and maximizing the ability to coordinate systems) or in a separate location that requires communication through a network, including the Internet. - At
box 406, the security server system intercepts the code and modifies it. For example, as described above, the names of certain functions may be changed in a sufficiently random or arbitrary manner that the new names cannot be anticipated by malware running on the clients. The changes may be coordinated across different types of code (e.g., HTML, CSS, and JavaScript) where the names occur, so that the code functions the same as the code it replaced. Generally, the changes are made to latent code whose operation a user does not see, and static code. - At
box 408, the code is appended with monitoring and reporting code. Such code may monitor the DOM that is created on the client when the served code is rendered, or may monitor attempts to interact with the code, and may characterize and report any abnormal activity. Such code may also report other status information about a client, such as configuration information that describes the features of the client system. In certain situations, a complete picture of what is occurring in the browser or other application (e.g., a specific app programmed for the company that serves the code) may be reported. The reporting code may in particular include code for making a determination whether to report particular information in a compressed versus an uncompressed form, and then to transmit the data back to the server system accordingly. - At
box 410, the client renders the web page by executing the various types of served code, and perhaps by acquiring code from other sources in addition to the code that was initially served by the web server system (whether from the organization that operates the web server system or from one or more other organizations). As described throughout this process, the serving and executing of code described here would be repeated across thousands or more different client devices that may each vary in different ways, such as by having different base (the basic computer) and extended hardware (e.g., added graphics cards or RAM), operating systems, installed and executing applications, and executing browser plug-ins. Thus, each rendering of the web page may be performed in a different manner for different ones of the client devices, and even for the same client device in different sessions. - At
box 412, the client device generates characterization and activity data that is to be sent back to the server systems. The box is labeled with a “1” to indicate that this step represents a subset of the devices that are served the web code, and are the devices that hash the data that is to be reported so as to lower the bandwidth required for such reporting. Generally, the vast majority of instances would be established to report in such a manner so as to significantly reduce the overhead of transmitting the data. - In this example, characterization data represents status of the client device, such as hardware and software on the device, whereas activity data represents actions that have occurred on the device, particularly since the device received the served web code (e.g., activities between the served code and other code that is on the device). The characterization data may be sent to one server system, while the activity data may be sent to another, or they may be sent to the same server system. Also, certain data may be sent according to the compressed/uncompressed scheme described in this document when the data is expected to be common across many devices, so that the original value of the content for devices that compress their content can be inferred from the uncompressed content (where, unless otherwise noted, uncompressed content includes content whose original form can be determined by a server system that receives it, and thus includes losslessly compressed content). Other data may be sent in a normal manner, without the pairing of compressed/uncompressed transmission, such as where the content is not typically common as among different machines, so that there would be relatively little value in trying to infer the original content from transmissions made by other machines.
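The bandwidth effect of having the vast majority of clients report in hashed form can be made concrete with a small Python simulation. Everything named here (the 16-character truncated digest, `plaintext_level`, and the helper functions) is an illustrative assumption; the document does not fix a hash length or a threshold value.

```python
import hashlib
import random

HASH_HEX_CHARS = 16  # assumed length of the truncated digest


def hash_value(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:HASH_HEX_CHARS]


def report_field(value, plaintext_level, rng):
    """Report plaintext only when the draw does not exceed the served level."""
    if rng.random() <= plaintext_level:
        return {"form": "plain", "value": value}
    return {"form": "hash", "value": hash_value(value)}


def expected_chars_per_report(plain_len, plaintext_level, hash_len=HASH_HEX_CHARS):
    """Average payload size: p * plain_len + (1 - p) * hash_len."""
    return plaintext_level * plain_len + (1 - plaintext_level) * hash_len


def simulate(value, plaintext_level, clients, seed=0):
    """Return (total characters sent, number of plaintext reports)."""
    rng = random.Random(seed)
    reports = [report_field(value, plaintext_level, rng) for _ in range(clients)]
    total = sum(len(r["value"]) for r in reports)
    plain = sum(r["form"] == "plain" for r in reports)
    return total, plain
```

Under these assumptions, a 120-character value reported with a `plaintext_level` of 0.01 averages 0.01 × 120 + 0.99 × 16 = 17.04 characters per report, and only a single plaintext report is needed for the server to learn what the shared hash stands for.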
- At
box 414, an analysis system receives the reported data, which may include activity data. The analysis system may use such activity data to identify that certain normal or anomalous activities have occurred on a certain device, and may conduct analysis on similar activity data received from a large number of other devices to identify clusters of common activity so as to determine that malware is taking advantage of such devices. The analysis system may also be provided with characterization data so that it can determine characteristics of the devices that are being affected by the malware. - Separately, or as part of the same communication, the client device may provide similar data to the security server system, as indicated at
box 416. The security server system may then associate the particular client device with the hashed forms of the compressed content that is sent (as the analysis system may do if it receives only hashed data). At this point in the example process, the security server system has received no unhashed form of the content, so it does not know what the original form of the content was. As a result, the system may simply associate an identifier for the particular device with the hashed form of the received content (or may simply index upward a count of the number of clients reporting the content of the particular form of hash). In this example, multiple different fields may be reported in a hashed manner, such as one or more fields that identify hardware for a device, and one or more fields that identify software executing on the device. Each feature of the device (e.g., make and model, operating system, amount of RAM, etc.) may receive its own hash, or groups of features may receive a single hash—where each hash is selected so as to cover content that is likely to be common across many devices, so that the hash value may be readily reverse-engineered when an uncompressed version of the content is received from another device. - At
box 418, another client (indicated by the circled “2”) also generates and reports characterization and activity data. In this instance, the particular device does not compress the content that it reports, e.g., because it selected a number pseudo-randomly that does not exceed a predetermined level that was provided with the web page code. The analysis system may receive at least some of the generated content at box 420 (which may be the same content as received at box 414 or may contain some fields whose parameters are the same as those received at box 414), though here the content would be received in uncompressed form (e.g., either as plaintext or in a losslessly compressed format). To the extent the analysis system previously received parameters for certain fields in compressed format, it may compress the received uncompressed content to form a hash value and may then compare it to compressed content that was previously received. If the hash value matches a hash value stored from box 414, then the original content may be associated by the system with the other devices that previously reported the hashed value, as may future devices that report the hashed value. Alternatively, or in addition, the analysis system can add to a number of devices that have reported as having the particular parameter. - Similarly, the second client device can report the characterization and activity data to the security server system, and at
box 422, that system can generate a hash value for it. As with the analysis system, certain other fields may have been reported in hashed form or may always be reported by all devices in uncompressed form. - At
box 424, the security server system associates the particular parameters received frombox 412 with the other instances of reporting the same content (as determined by comparing the just-generated hash value with previously-received hash values form the other devices). - In situations where the analysis system does not separately track associations between particular device IDs and content reported by those devices, the analysis system can request ID and parameter data (box 426) from the security server system. The security server system (box 428) may gather and transmit such data, and the analysis system may identify common features of anomalously-acting machines (box 430) using such data. In other words, in one implementation, the analysis system may receive activity data and use such data to identify clusters of common activity, or otherwise identify potential problems that arise in the operation of a number of different client devices. At the time of such initial analysis, the analysis system may not know the characterization data for the devices, and may only seek such data from the security server system after identifying the problem. Such follow-up information gathering may then be used by the analysis system to identify features of the devices that are determined to be acting anomalously, such as by determining that they all are executing the same browser program, and perhaps a common version or range of versions of that program. In yet other embodiments, the analysis system may repeat operations that are performed by the security server system, such as in the inferring of the original content of compressed messages via compressing of received uncompressed messages.
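- By way of illustration only, the bookkeeping described for boxes 416 through 430 might be sketched as follows. The sketch is not taken from the specification: the names `SecurityServerIndex`, `field_hash`, and `common_features`, the device identifiers, and the choice of SHA-256 as the hash are all assumptions made for the example. It shows a device being associated with a hash value before the original content is known, the original content being learned when another device reports it uncompressed, and the features shared by a set of anomalously-acting devices then being identified.

```python
import hashlib
from collections import Counter, defaultdict


def field_hash(field: str, value: str) -> str:
    """Hash one field value. SHA-256 is an assumption; the text names no algorithm."""
    return hashlib.sha256(f"{field}={value}".encode("utf-8")).hexdigest()


class SecurityServerIndex:
    """Hypothetical bookkeeping for the security server system (boxes 416-428)."""

    def __init__(self) -> None:
        # (field, hash) -> set of device IDs that reported that hash
        self.devices_by_hash = defaultdict(set)
        # (field, hash) -> original value, filled in once some device reports it raw
        self.raw_by_hash = {}

    def report_hashed(self, device_id: str, field: str, digest: str) -> None:
        """A device reported only the hashed form; the original is not yet known."""
        self.devices_by_hash[(field, digest)].add(device_id)

    def report_raw(self, device_id: str, field: str, value: str) -> None:
        """A device reported the uncompressed form; hash it and learn the mapping."""
        digest = field_hash(field, value)
        self.raw_by_hash[(field, digest)] = value
        self.devices_by_hash[(field, digest)].add(device_id)

    def features_of(self, device_id: str) -> dict:
        """Return {field: original value, or None if still unknown} for one device."""
        features = {}
        for (field, digest), ids in self.devices_by_hash.items():
            if device_id in ids:
                features[field] = self.raw_by_hash.get((field, digest))
        return features


def common_features(index: SecurityServerIndex, anomalous_ids: list) -> set:
    """Return the (field, value) pairs shared by every listed device (box 430)."""
    counts = Counter()
    for device_id in anomalous_ids:
        for field, value in index.features_of(device_id).items():
            if value is not None:
                counts[(field, value)] += 1
    return {pair for pair, n in counts.items() if n == len(anomalous_ids)}
```

In this sketch, a device that reported only a hash contributes nothing to `common_features` until some other device has reported the same value in uncompressed form, which mirrors the point above that the original content is unknown to the server until then.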
- Also, the security server system and the analysis server system may be part of the same system or separate systems. For example, a retailer may manage both systems along with a web server system. In another example, a third party may operate the analysis server system from its own facility, and can assist customers with operating their particular security server systems on their premises, with their web server systems. The third party may aggregate activity across a large volume of served content in such manner, and may more readily identify anomalous behavior than could a single organization serving only a fraction of such content.
- FIGS. 5A and 5B show, respectively, state diagrams for a client and a server operating according to the mechanisms described above. Referring specifically to the client encoding state machine of FIG. 5A, at box 502, the client device begins its operations by which it prepares information for transmission to the server system and performs the transmission. At box 504, a determination is made whether more fields need to be encoded for transmission. If not, then the client waits until a next time that processing and transmission is needed.
- At box 506, if more fields need to be transmitted to the server, a random number (which may be pseudorandom or otherwise less than exactly random) is generated. The number may be generated for each overall transmission or for each field within a transmission (so that some field values may be compressed and some not). At box 508, the client determines whether the generated number exceeds a threshold. That threshold may be a predetermined value that is relatively permanent and stored by the client for a long time, or may be highly variable, where the threshold is transmitted with code recently received by the client, or is accessed at run time by the client (e.g., by submitting a GET request to a remote server system). If the generated number exceeds the threshold, then the client sends a raw version of the relevant field, such as in plaintext or losslessly compressed form of the content for the field (box 512). If the threshold is not exceeded, then a hashed version of the content is sent (box 510). Of course, the determination may be made inversely, so that the hashed form is sent if the threshold is exceeded (and/or matched), and the raw data is sent if it is not (and/or is matched).
- Referring now to FIG. 5B, there is shown a state diagram for a server that interacts with the operations of the client just described. At box 520, the process begins, such as by the server determining that it has received data for a plurality of fields, where the data needs to be interpreted by the server system. If there are no more fields to process, the system returns to a rest state, but if there are, then the server analyzes the next field in line and determines whether it is in raw form or hash form (box 524). If it is in raw form, then the server hashes the field using a hashing technique that matches a technique that the server knows to be performed by various clients that are reporting data to it (box 532). The server then associates the hash result with the raw content (box 534). The system can then use such a correlation between the hash result and the raw data to interpret other communications from other clients that contain only the hash result. In particular, the system can use the correlation to infer what the raw form at the client was when only the hash form is received.
- If the field is not in raw form (is in hash form), the system performs a lookup on the hash form (box 526). The system determines whether the hash form of the field is found in the system (box 528), so as to indicate that a correlation has already been stored between the hash form and the raw form. If the hash form is found, then the system can get the raw value (box 530) and act accordingly. If the field is not found (e.g., because the value for the field has not previously been received in raw form), then the occurrence of the receipt of the hash form from the client may be saved and noted, and the system may return to check if additional fields need processing.
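- As a non-limiting illustration, the client and server state machines of FIGS. 5A and 5B might be sketched as follows. Nothing in the sketch comes from the specification itself: the names `encode_field`, `hash_value`, and `HashResolver`, the `("raw", ...)` and `("hash", ...)` message tuples, the use of SHA-256, and the use of a uniform draw in [0, 1) are all assumptions chosen for the example. The comparison follows the direction described above (raw content is sent when the drawn number exceeds the threshold), though the inverse is equally possible.

```python
import hashlib
import random


def hash_value(value: str) -> str:
    """Hash a field value. SHA-256 is an assumption; client and server must agree."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def encode_field(value: str, threshold: float, rng=random):
    """Client side (boxes 506-512): draw a number and pick raw or hashed form."""
    draw = rng.random()  # box 506: pseudorandom number in [0, 1)
    if draw > threshold:  # box 508: compare against the threshold
        return ("raw", value)  # box 512: send the uncompressed content
    return ("hash", hash_value(value))  # box 510: send only the hash


class HashResolver:
    """Server side (boxes 524-534): learn hash-to-raw pairs and resolve hashes."""

    def __init__(self) -> None:
        self.raw_by_hash = {}  # hash -> raw value learned from raw reports
        self.unresolved = {}   # hash -> number of times seen before being learned

    def process(self, kind: str, payload: str):
        """Handle one field; return the raw value, or None if it is not yet known."""
        if kind == "raw":
            # boxes 532-534: hash the raw value and store the correlation
            self.raw_by_hash[hash_value(payload)] = payload
            return payload
        # boxes 526-530: look up a previously stored correlation
        raw = self.raw_by_hash.get(payload)
        if raw is None:
            # hash not seen in raw form yet: note the occurrence for later
            self.unresolved[payload] = self.unresolved.get(payload, 0) + 1
        return raw
```

With a threshold near 1, almost every report in this sketch is sent as a hash, so bandwidth is saved at the cost of waiting longer for the first raw report of each value; a lower threshold reverses that trade-off.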
- FIG. 6 is a schematic diagram of a computer system 600. The system 600 can be used for the operations described in association with any of the computer-implemented methods described previously, according to one implementation. The system 600 is intended to include various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The system 600 can also include mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. Additionally, the system can include portable storage media, such as Universal Serial Bus (USB) flash drives. For example, the USB flash drives may store operating systems and other applications. The USB flash drives can include input/output components, such as a wireless transmitter or USB connector that may be inserted into a USB port of another computing device.
- The system 600 includes a processor 610, a memory 620, a storage device 630, and an input/output device 640. Each of the components 610, 620, 630, and 640 is interconnected using a system bus 650. The processor 610 is capable of processing instructions for execution within the system 600. The processor may be designed using any of a number of architectures. For example, the processor 610 may be a CISC (Complex Instruction Set Computer) processor, a RISC (Reduced Instruction Set Computer) processor, or a MISC (Minimal Instruction Set Computer) processor.
- In one implementation, the processor 610 is a single-threaded processor. In another implementation, the processor 610 is a multi-threaded processor. The processor 610 is capable of processing instructions stored in the memory 620 or on the storage device 630 to display graphical information for a user interface on the input/output device 640.
- The memory 620 stores information within the system 600. In one implementation, the memory 620 is a computer-readable medium. In one implementation, the memory 620 is a volatile memory unit. In another implementation, the memory 620 is a non-volatile memory unit.
- The storage device 630 is capable of providing mass storage for the system 600. In one implementation, the storage device 630 is a computer-readable medium. In various different implementations, the storage device 630 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
- The input/output device 640 provides input/output operations for the system 600. In one implementation, the input/output device 640 includes a keyboard and/or pointing device. In another implementation, the input/output device 640 includes a display unit for displaying graphical user interfaces.
- The features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output. The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- To provide for interaction with a user, the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer. Additionally, such activities can be implemented via touchscreen flat-panel displays and other appropriate mechanisms.
- The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), peer-to-peer networks (having ad-hoc or static members), grid computing infrastructures, and the Internet.
- The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network, such as the described one. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular implementations of particular inventions. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
- Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
- Thus, particular implementations of the subject matter have been described. Other implementations are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/236,566 US20190140835A1 (en) | 2014-01-21 | 2018-12-30 | Blind Hash Compression |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/160,107 US9225729B1 (en) | 2014-01-21 | 2014-01-21 | Blind hash compression |
US16/236,566 US20190140835A1 (en) | 2014-01-21 | 2018-12-30 | Blind Hash Compression |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/160,107 Continuation US9225729B1 (en) | 2014-01-21 | 2014-01-21 | Blind hash compression |
Publications (1)
Publication Number | Publication Date |
---|---|
US20190140835A1 (en) | 2019-05-09 |
Family
ID=54932554
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/160,107 Active 2034-04-20 US9225729B1 (en) | 2014-01-21 | 2014-01-21 | Blind hash compression |
US14/980,231 Active US10212137B1 (en) | 2014-01-21 | 2015-12-28 | Blind hash compression |
US16/236,566 Abandoned US20190140835A1 (en) | 2014-01-21 | 2018-12-30 | Blind Hash Compression |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/160,107 Active 2034-04-20 US9225729B1 (en) | 2014-01-21 | 2014-01-21 | Blind hash compression |
US14/980,231 Active US10212137B1 (en) | 2014-01-21 | 2015-12-28 | Blind hash compression |
Country Status (1)
Country | Link |
---|---|
US (3) | US9225729B1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10382482B2 (en) | 2015-08-31 | 2019-08-13 | Shape Security, Inc. | Polymorphic obfuscation of executable code |
US10536479B2 (en) | 2013-03-15 | 2020-01-14 | Shape Security, Inc. | Code modification for automation detection |
US10554777B1 (en) | 2014-01-21 | 2020-02-04 | Shape Security, Inc. | Caching for re-coding techniques |
US11088995B2 (en) | 2013-12-06 | 2021-08-10 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US11579985B2 (en) * | 2019-05-31 | 2023-02-14 | Acronis International Gmbh | System and method of preventing malware reoccurrence when restoring a computing device using a backup image |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9225737B2 (en) | 2013-03-15 | 2015-12-29 | Shape Security, Inc. | Detecting the introduction of alien content |
US8869281B2 (en) | 2013-03-15 | 2014-10-21 | Shape Security, Inc. | Protecting against the introduction of alien content |
US9338143B2 (en) | 2013-03-15 | 2016-05-10 | Shape Security, Inc. | Stateless web content anti-automation |
US8954583B1 (en) | 2014-01-20 | 2015-02-10 | Shape Security, Inc. | Intercepting and supervising calls to transformed operations and objects |
US9225729B1 (en) | 2014-01-21 | 2015-12-29 | Shape Security, Inc. | Blind hash compression |
US9608822B2 (en) * | 2014-03-18 | 2017-03-28 | Ecole Polytechnique Federale De Lausanne (Epfl) | Method for generating an HTML document that contains encrypted files and the code necessary for decrypting them when a valid passphrase is provided |
US8997226B1 (en) | 2014-04-17 | 2015-03-31 | Shape Security, Inc. | Detection of client-side malware activity |
US20150310218A1 (en) * | 2014-04-28 | 2015-10-29 | Verify Brand, Inc. | Systems and methods for secure distribution of codes |
US9075990B1 (en) | 2014-07-01 | 2015-07-07 | Shape Security, Inc. | Reliable selection of security countermeasures |
US9825984B1 (en) | 2014-08-27 | 2017-11-21 | Shape Security, Inc. | Background analysis of web content |
US10298599B1 (en) | 2014-09-19 | 2019-05-21 | Shape Security, Inc. | Systems for detecting a headless browser executing on a client computer |
US9954893B1 (en) | 2014-09-23 | 2018-04-24 | Shape Security, Inc. | Techniques for combating man-in-the-browser attacks |
US9813440B1 (en) | 2015-05-15 | 2017-11-07 | Shape Security, Inc. | Polymorphic treatment of annotated content |
US9986058B2 (en) | 2015-05-21 | 2018-05-29 | Shape Security, Inc. | Security systems for mitigating attacks from a headless browser executing on a client computer |
JP6428936B2 (en) * | 2015-06-10 | 2018-11-28 | 富士通株式会社 | Information processing apparatus, information processing method, and information processing program |
WO2017007705A1 (en) | 2015-07-06 | 2017-01-12 | Shape Security, Inc. | Asymmetrical challenges for web security |
WO2017007936A1 (en) | 2015-07-07 | 2017-01-12 | Shape Security, Inc. | Split serving of computer code |
US10375026B2 (en) | 2015-10-28 | 2019-08-06 | Shape Security, Inc. | Web transaction status tracking |
US10212130B1 (en) | 2015-11-16 | 2019-02-19 | Shape Security, Inc. | Browser extension firewall |
US10326790B2 (en) | 2016-02-12 | 2019-06-18 | Shape Security, Inc. | Reverse proxy computer: deploying countermeasures in response to detecting an autonomous browser executing on a client computer |
US10855696B2 (en) | 2016-03-02 | 2020-12-01 | Shape Security, Inc. | Variable runtime transpilation |
US9917850B2 (en) | 2016-03-03 | 2018-03-13 | Shape Security, Inc. | Deterministic reproduction of client/server computer state or output sent to one or more client computers |
US10567363B1 (en) | 2016-03-03 | 2020-02-18 | Shape Security, Inc. | Deterministic reproduction of system state using seeded pseudo-random number generators |
US10129289B1 (en) | 2016-03-11 | 2018-11-13 | Shape Security, Inc. | Mitigating attacks on server computers by enforcing platform policies on client computers |
US10275596B1 (en) * | 2016-12-15 | 2019-04-30 | Symantec Corporation | Activating malicious actions within electronic documents |
US11736299B2 (en) * | 2019-01-18 | 2023-08-22 | Prometheus8 | Data access control for edge devices using a cryptographic hash |
US20200326892A1 (en) * | 2019-04-10 | 2020-10-15 | Microsoft Technology Licensing, Llc | Methods for encrypting and updating virtual disks |
CN109947388B (en) * | 2019-04-15 | 2020-10-02 | 腾讯科技(深圳)有限公司 | Page playing and reading control method and device, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6343313B1 (en) * | 1996-03-26 | 2002-01-29 | Pixion, Inc. | Computer conferencing system with real-time multipoint, multi-speed, multi-stream scalability |
US20060085541A1 (en) * | 2004-10-19 | 2006-04-20 | International Business Machines Corporation | Facilitating optimization of response time in computer networks |
US7051362B2 (en) * | 2000-05-16 | 2006-05-23 | Ideaflood, Inc. | Method and system for operating a network server to discourage inappropriate use |
US8855143B1 (en) * | 2005-04-21 | 2014-10-07 | Joseph Acampora | Bandwidth saving system and method for communicating self describing messages over a network |
Family Cites Families (191)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5509076A (en) | 1994-05-02 | 1996-04-16 | General Instrument Corporation Of Delaware | Apparatus for securing the integrity of a functioning system |
CN100452071C (en) | 1995-02-13 | 2009-01-14 | 英特特拉斯特技术公司 | Systems and methods for secure transaction management and electronic rights protection |
US8225408B2 (en) | 1997-11-06 | 2012-07-17 | Finjan, Inc. | Method and system for adaptive rule-based content scanners |
US7975305B2 (en) | 1997-11-06 | 2011-07-05 | Finjan, Inc. | Method and system for adaptive rule-based content scanners for desktop computers |
SE512672C2 (en) | 1998-06-12 | 2000-04-17 | Ericsson Telefon Ab L M | Procedure and system for transferring a cookie |
US6697948B1 (en) | 1999-05-05 | 2004-02-24 | Michael O. Rabin | Methods and apparatus for protecting information |
US7430670B1 (en) | 1999-07-29 | 2008-09-30 | Intertrust Technologies Corp. | Software self-defense systems and methods |
US7107347B1 (en) | 1999-11-15 | 2006-09-12 | Fred Cohen | Method and apparatus for network deception/emulation |
US7058699B1 (en) | 2000-06-16 | 2006-06-06 | Yahoo! Inc. | System and methods for implementing code translations that enable persistent client-server communication via a proxy |
US6938170B1 (en) | 2000-07-17 | 2005-08-30 | International Business Machines Corporation | System and method for preventing automated crawler access to web-based data sources using a dynamic data transcoding scheme |
US7117239B1 (en) | 2000-07-28 | 2006-10-03 | Axeda Corporation | Reporting the state of an apparatus to a remote computer |
US7398553B1 (en) | 2000-10-30 | 2008-07-08 | Tread Micro, Inc. | Scripting virus scan engine |
US7171443B2 (en) | 2001-04-04 | 2007-01-30 | Prodigy Communications, Lp | Method, system, and software for transmission of information |
WO2002088951A1 (en) | 2001-04-26 | 2002-11-07 | Telefonaktiebolaget Lm Ericsson (Publ) | Stateless server |
US7028305B2 (en) | 2001-05-16 | 2006-04-11 | Softricity, Inc. | Operating system abstraction and protection layer |
US20020199116A1 (en) | 2001-06-25 | 2002-12-26 | Keith Hoene | System and method for computer network virus exclusion |
US7010779B2 (en) | 2001-08-16 | 2006-03-07 | Knowledge Dynamics, Inc. | Parser, code generator, and data calculation and transformation engine for spreadsheet calculations |
CA2359831A1 (en) | 2001-10-24 | 2003-04-24 | Ibm Canada Limited-Ibm Canada Limitee | Method and system for multiple level parsing |
US6654707B2 (en) | 2001-12-28 | 2003-11-25 | Dell Products L.P. | Performing diagnostic tests of computer devices while operating system is running |
WO2003067405A2 (en) | 2002-02-07 | 2003-08-14 | Empirix Inc. | Automated security threat testing of web pages |
US7114160B2 (en) | 2002-04-17 | 2006-09-26 | Sbc Technology Resources, Inc. | Web content customization via adaptation Web services |
US7117429B2 (en) | 2002-06-12 | 2006-10-03 | Oracle International Corporation | Methods and systems for managing styles electronic documents |
JP4093012B2 (en) | 2002-10-17 | 2008-05-28 | 日本電気株式会社 | Hypertext inspection apparatus, method, and program |
US9009084B2 (en) | 2002-10-21 | 2015-04-14 | Rockwell Automation Technologies, Inc. | System and methodology providing automation security analysis and network intrusion protection in an industrial environment |
US20050216770A1 (en) | 2003-01-24 | 2005-09-29 | Mistletoe Technologies, Inc. | Intrusion detection system |
US7735144B2 (en) | 2003-05-16 | 2010-06-08 | Adobe Systems Incorporated | Document modification detection and prevention |
US7500099B1 (en) | 2003-05-16 | 2009-03-03 | Microsoft Corporation | Method for mitigating web-based “one-click” attacks |
WO2004109532A1 (en) | 2003-06-05 | 2004-12-16 | Cubicice (Pty) Ltd | A method of collecting data regarding a plurality of web pages visited by at least one user |
US7707634B2 (en) | 2004-01-30 | 2010-04-27 | Microsoft Corporation | System and method for detecting malware in executable scripts according to its functionality |
US20050198099A1 (en) | 2004-02-24 | 2005-09-08 | Covelight Systems, Inc. | Methods, systems and computer program products for monitoring protocol responses for a server application |
US7424720B2 (en) | 2004-03-25 | 2008-09-09 | International Business Machines Corporation | Process and implementation for dynamically determining probe enablement using out of process correlating token |
US7587537B1 (en) | 2007-11-30 | 2009-09-08 | Altera Corporation | Serializer-deserializer circuits formed from input-output circuit registers |
US7519621B2 (en) | 2004-05-04 | 2009-04-14 | Pagebites, Inc. | Extracting information from Web pages |
US20060101047A1 (en) | 2004-07-29 | 2006-05-11 | Rice John R | Method and system for fortifying software |
WO2006023948A2 (en) | 2004-08-24 | 2006-03-02 | Washington University | Methods and systems for content detection in a reconfigurable hardware |
US8181104B1 (en) | 2004-08-31 | 2012-05-15 | Adobe Systems Incorporated | Automatic creation of cascading style sheets |
US7480385B2 (en) | 2004-11-05 | 2009-01-20 | Cable Television Laboratories, Inc. | Hierarchical encryption key system for securing digital media |
US8850565B2 (en) | 2005-01-10 | 2014-09-30 | Hewlett-Packard Development Company, L.P. | System and method for coordinating network incident response activities |
US8365293B2 (en) | 2005-01-25 | 2013-01-29 | Redphone Security, Inc. | Securing computer network interactions between entities with authorization assurances |
US8281401B2 (en) | 2005-01-25 | 2012-10-02 | Whitehat Security, Inc. | System for detecting vulnerabilities in web applications using client-side application interfaces |
US20060230288A1 (en) | 2005-03-29 | 2006-10-12 | International Business Machines Corporation | Source code classification method for malicious code detection |
US20090070459A1 (en) | 2005-04-18 | 2009-03-12 | Cho Young H | High-Performance Context-Free Parser for Polymorphic Malware Detection |
US7707223B2 (en) | 2005-04-28 | 2010-04-27 | Cisco Technology, Inc. | Client-side java content transformation |
US20060288418A1 (en) | 2005-06-15 | 2006-12-21 | Tzu-Jian Yang | Computer-implemented method with real-time response mechanism for detecting viruses in data transfer on a stream basis |
US9467462B2 (en) | 2005-09-15 | 2016-10-11 | Hewlett Packard Enterprise Development Lp | Traffic anomaly analysis for the detection of aberrant network code |
US7770185B2 (en) | 2005-09-26 | 2010-08-03 | Bea Systems, Inc. | Interceptor method and system for web services for remote portlets |
US20070088955A1 (en) | 2005-09-28 | 2007-04-19 | Tsern-Huei Lee | Apparatus and method for high speed detection of undesirable data content |
US8195953B1 (en) | 2005-10-25 | 2012-06-05 | Trend Micro, Inc. | Computer program with built-in malware protection |
US8170020B2 (en) | 2005-12-08 | 2012-05-01 | Microsoft Corporation | Leveraging active firewalls for network intrusion detection and retardation of attack |
US8086756B2 (en) | 2006-01-25 | 2011-12-27 | Cisco Technology, Inc. | Methods and apparatus for web content transformation and delivery |
US20080208785A1 (en) | 2006-03-30 | 2008-08-28 | Pegasystems, Inc. | User interface methods and apparatus for rules processing |
US8407482B2 (en) | 2006-03-31 | 2013-03-26 | Avaya Inc. | User session dependent URL masking |
US8601064B1 (en) | 2006-04-28 | 2013-12-03 | Trend Micro Incorporated | Techniques for defending an email system against malicious sources |
US7849502B1 (en) | 2006-04-29 | 2010-12-07 | Ironport Systems, Inc. | Apparatus for monitoring network traffic |
GB0620855D0 (en) | 2006-10-19 | 2006-11-29 | Dovetail Software Corp Ltd | Data processing apparatus and method |
US8290800B2 (en) | 2007-01-30 | 2012-10-16 | Google Inc. | Probabilistic inference of site demographics from aggregate user internet usage and source demographic information |
WO2008095018A2 (en) | 2007-01-31 | 2008-08-07 | Omniture, Inc. | Page grouping for site traffic analysis reports |
US20080222736A1 (en) | 2007-03-07 | 2008-09-11 | Trusteer Ltd. | Scrambling HTML to prevent CSRF attacks and transactional crimeware attacks |
US7870610B1 (en) | 2007-03-16 | 2011-01-11 | The Board Of Directors Of The Leland Stanford Junior University | Detection of malicious programs |
CN101276362B (en) | 2007-03-26 | 2011-05-11 | 国际商业机器公司 | Apparatus and method for customizing web page |
EP2149093A4 (en) | 2007-04-17 | 2010-05-05 | Kenneth Tola | Unobtrusive methods and systems for collecting information transmitted over a network |
US7895653B2 (en) | 2007-05-31 | 2011-02-22 | International Business Machines Corporation | Internet robot detection for network distributable markup |
US8181246B2 (en) | 2007-06-20 | 2012-05-15 | Imperva, Inc. | System and method for preventing web frauds committed using client-scripting attacks |
EP2165499B1 (en) | 2007-06-22 | 2013-01-30 | Gemalto SA | A method of preventing web browser extensions from hijacking user information |
US20090007243A1 (en) | 2007-06-27 | 2009-01-01 | Trusteer Ltd. | Method for rendering password theft ineffective |
US8510431B2 (en) | 2007-07-13 | 2013-08-13 | Front Porch, Inc. | Method and apparatus for internet traffic monitoring by third parties using monitoring implements transmitted via piggybacking HTTP transactions |
US8689330B2 (en) | 2007-09-05 | 2014-04-01 | Yahoo! Inc. | Instant messaging malware protection |
HUE044989T2 (en) | 2007-09-07 | 2019-12-30 | Dis Ent Llc | Software based multi-channel polymorphic data obfuscation |
US7941382B2 (en) | 2007-10-12 | 2011-05-10 | Microsoft Corporation | Method of classifying and active learning that ranks entries based on multiple scores, presents entries to human analysts, and detects and/or prevents malicious behavior |
US9509714B2 (en) | 2014-05-22 | 2016-11-29 | Cabara Software Ltd. | Web page and web browser protection against malicious injections |
US8260845B1 (en) | 2007-11-21 | 2012-09-04 | Appcelerator, Inc. | System and method for auto-generating JavaScript proxies and meta-proxies |
US8347396B2 (en) | 2007-11-30 | 2013-01-01 | International Business Machines Corporation | Protect sensitive content for human-only consumption |
US8849985B1 (en) | 2007-12-03 | 2014-09-30 | Appcelerator, Inc. | On-the-fly instrumentation of Web applications, Web-pages or Web-sites |
CN101471818B (en) | 2007-12-24 | 2011-05-04 | 北京启明星辰信息技术股份有限公司 | Detection method and system for malevolence injection script web page |
US8646067B2 (en) | 2008-01-26 | 2014-02-04 | Citrix Systems, Inc. | Policy driven fine grain URL encoding mechanism for SSL VPN clientless access |
US20090192848A1 (en) | 2008-01-30 | 2009-07-30 | Gerald Rea | Method and apparatus for workforce assessment |
US8387139B2 (en) | 2008-02-04 | 2013-02-26 | Microsoft Corporation | Thread scanning and patching to disable injected malware threats |
US20090241174A1 (en) | 2008-02-19 | 2009-09-24 | Guru Rajan | Handling Human Detection for Devices Connected Over a Network |
US8650648B2 (en) | 2008-03-26 | 2014-02-11 | Sophos Limited | Method and system for detecting restricted content associated with retrieved content |
US9317255B2 (en) | 2008-03-28 | 2016-04-19 | Microsoft Technology Licensing, LLC | Automatic code transformation with state transformer monads |
CA2630388A1 (en) | 2008-05-05 | 2009-11-05 | Nima Sharifmehr | Apparatus and method to prevent man in the middle attack |
US8086957B2 (en) | 2008-05-21 | 2011-12-27 | International Business Machines Corporation | Method and system to selectively secure the display of advertisements on web browsers |
KR100987354B1 (en) | 2008-05-22 | 2010-10-12 | 주식회사 이베이지마켓 | System for checking false code in website and Method thereof |
US8762962B2 (en) | 2008-06-16 | 2014-06-24 | Beek Fund B.V. L.L.C. | Methods and apparatus for automatic translation of a computer program language code |
US8453126B1 (en) | 2008-07-30 | 2013-05-28 | Dulles Research LLC | System and method for converting base SAS runtime macro language scripts to JAVA target language |
US8200958B2 (en) | 2008-10-03 | 2012-06-12 | Limelight Networks, Inc. | Content delivery network encryption |
US8677481B1 (en) | 2008-09-30 | 2014-03-18 | Trend Micro Incorporated | Verification of web page integrity |
US7953850B2 (en) | 2008-10-03 | 2011-05-31 | Computer Associates Think, Inc. | Monitoring related content requests |
US8020193B2 (en) | 2008-10-20 | 2011-09-13 | International Business Machines Corporation | Systems and methods for protecting web based applications from cross site request forgery attacks |
US8434068B2 (en) | 2008-10-23 | 2013-04-30 | XMOS Ltd. | Development system |
US20100106611A1 (en) | 2008-10-24 | 2010-04-29 | Uc Group Ltd. | Financial transactions systems and methods |
US8526306B2 (en) | 2008-12-05 | 2013-09-03 | Cloudshield Technologies, Inc. | Identification of patterns in stateful transactions |
US8225401B2 (en) | 2008-12-18 | 2012-07-17 | Symantec Corporation | Methods and systems for detecting man-in-the-browser attacks |
CN101788982B (en) | 2009-01-22 | 2013-03-06 | 国际商业机器公司 | Method of cross-domain interaction and for protecting Web application in unmodified browser and system thereof |
CN101482882A (en) | 2009-02-17 | 2009-07-15 | 阿里巴巴集团控股有限公司 | Method and system for cross-domain treatment of COOKIE |
US8413239B2 (en) | 2009-02-22 | 2013-04-02 | Zscaler, Inc. | Web security via response injection |
AU2010223925A1 (en) | 2009-03-13 | 2011-11-03 | Rutgers, The State University Of New Jersey | Systems and methods for the detection of malware |
US20100240449A1 (en) | 2009-03-19 | 2010-09-23 | Guy Corem | System and method for controlling usage of executable code |
US9311425B2 (en) | 2009-03-31 | 2016-04-12 | Qualcomm Incorporated | Rendering a page using a previously stored DOM associated with a different page |
US8838628B2 (en) | 2009-04-24 | 2014-09-16 | Bonnie Berger Leighton | Intelligent search tool for answering clinical queries |
US9336191B2 (en) | 2009-05-05 | 2016-05-10 | Suboti, Llc | System, method and computer readable medium for recording authoring events with web page content |
US8332952B2 (en) | 2009-05-22 | 2012-12-11 | Microsoft Corporation | Time window based canary solutions for browser security |
US8527774B2 (en) | 2009-05-28 | 2013-09-03 | Kaazing Corporation | System and methods for providing stateless security management for web applications using non-HTTP communications protocols |
WO2010143152A2 (en) | 2009-06-10 | 2010-12-16 | Site Black Box Ltd | Identifying bots |
US8924943B2 (en) | 2009-07-17 | 2014-12-30 | Ebay Inc. | Browser emulator system |
US8438312B2 (en) | 2009-10-23 | 2013-05-07 | Moov Corporation | Dynamically rehosting web content |
US8775818B2 (en) | 2009-11-30 | 2014-07-08 | Red Hat, Inc. | Multifactor validation of requests to thwart dynamic cross-site attacks |
WO2011073982A1 (en) | 2009-12-15 | 2011-06-23 | Seeker Security Ltd. | Method and system of runtime analysis |
US9015686B2 (en) | 2009-12-21 | 2015-04-21 | Oracle America, Inc. | Redundant run-time type information removal |
US8739284B1 (en) | 2010-01-06 | 2014-05-27 | Symantec Corporation | Systems and methods for blocking and removing internet-traversing malware |
US8660976B2 (en) | 2010-01-20 | 2014-02-25 | Microsoft Corporation | Web content rewriting, including responses |
CA2694326A1 (en) | 2010-03-10 | 2010-05-18 | Ibm Canada Limited - Ibm Canada Limitee | A method and system for preventing cross-site request forgery attacks on a server |
US20110231305A1 (en) | 2010-03-19 | 2011-09-22 | Visa U.S.A. Inc. | Systems and Methods to Identify Spending Patterns |
US9634993B2 (en) | 2010-04-01 | 2017-04-25 | Cloudflare, Inc. | Internet-based proxy service to modify internet responses |
EP2558957A2 (en) | 2010-04-12 | 2013-02-20 | Google, Inc. | Scrolling in large hosted data set |
US8561193B1 (en) | 2010-05-17 | 2013-10-15 | Symantec Corporation | Systems and methods for analyzing malware |
US9646140B2 (en) | 2010-05-18 | 2017-05-09 | ServiceSource | Method and apparatus for protecting online content by detecting noncompliant access patterns |
US8739150B2 (en) | 2010-05-28 | 2014-05-27 | Smartshift Gmbh | Systems and methods for dynamically replacing code objects via conditional pattern templates |
WO2012005739A1 (en) | 2010-07-09 | 2012-01-12 | Hewlett-Packard Development Company, L.P. | Responses to server challenges included in a hypertext transfer protocol header |
US8589405B1 (en) | 2010-07-16 | 2013-11-19 | Netlogic Microsystems, Inc. | Token stitcher for a content search system having pipelined engines |
US8707428B2 (en) | 2010-08-05 | 2014-04-22 | At&T Intellectual Property I, L.P. | Apparatus and method for defending against internet-based attacks |
CA2712542C (en) | 2010-08-25 | 2012-09-11 | Ibm Canada Limited - Ibm Canada Limitee | Two-tier deep analysis of html traffic |
US20120124372A1 (en) | 2010-10-13 | 2012-05-17 | Akamai Technologies, Inc. | Protecting Websites and Website Users By Obscuring URLs |
US8631091B2 (en) | 2010-10-15 | 2014-01-14 | Northeastern University | Content distribution network using a web browser and locally stored content to directly exchange content between users |
US9473530B2 (en) | 2010-12-30 | 2016-10-18 | Verisign, Inc. | Client-side active validation for mitigating DDOS attacks |
AU2011200413B1 (en) | 2011-02-01 | 2011-09-15 | Symbiotic Technologies Pty Ltd | Methods and Systems to Detect Attacks on Internet Transactions |
US8667565B2 (en) | 2011-02-18 | 2014-03-04 | Microsoft Corporation | Security restructuring for web media |
US8732571B2 (en) | 2011-03-31 | 2014-05-20 | Google Inc. | Methods and systems for generating and displaying a preview image of a content area |
US9456050B1 (en) | 2011-04-11 | 2016-09-27 | Viasat, Inc. | Browser optimization through user history analysis |
US8869279B2 (en) | 2011-05-13 | 2014-10-21 | Imperva, Inc. | Detecting web browser based attacks using browser response comparison tests launched from a remote source |
US8555388B1 (en) | 2011-05-24 | 2013-10-08 | Palo Alto Networks, Inc. | Heuristic botnet detection |
US20120324236A1 (en) | 2011-06-16 | 2012-12-20 | Microsoft Corporation | Trusted Snapshot Generation |
US8707434B2 (en) | 2011-08-17 | 2014-04-22 | Mcafee, Inc. | System and method for indirect interface monitoring and plumb-lining |
US8966643B2 (en) | 2011-10-08 | 2015-02-24 | Broadcom Corporation | Content security in a social network |
US8578499B1 (en) | 2011-10-24 | 2013-11-05 | Trend Micro Incorporated | Script-based scan engine embedded in a webpage for protecting computers against web threats |
WO2013091709A1 (en) | 2011-12-22 | 2013-06-27 | Fundació Privada Barcelona Digital Centre Tecnologic | Method and apparatus for real-time dynamic transformation of the code of a web document |
US10049168B2 (en) | 2012-01-31 | 2018-08-14 | Openwave Mobility, Inc. | Systems and methods for modifying webpage data |
US9158893B2 (en) | 2012-02-17 | 2015-10-13 | Shape Security, Inc. | System for finding code in a data flow |
CA2859415C (en) | 2012-02-21 | 2016-01-12 | Logos Technologies, Llc | System for detecting, analyzing, and controlling infiltration of computer and network systems |
US20130227397A1 (en) | 2012-02-24 | 2013-08-29 | Microsoft Corporation | Forming an instrumented text source document for generating a live web page |
US8843820B1 (en) | 2012-02-29 | 2014-09-23 | Google Inc. | Content script blacklisting for use with browser extensions |
EP2642715A1 (en) | 2012-03-20 | 2013-09-25 | British Telecommunications public limited company | Method and system for malicious code detection |
US9111090B2 (en) | 2012-04-02 | 2015-08-18 | Trusteer, Ltd. | Detection of phishing attempts |
GB2501267B (en) | 2012-04-17 | 2016-10-26 | Bango Net Ltd | Payment authentication systems |
US20140089786A1 (en) | 2012-06-01 | 2014-03-27 | Atiq Hashmi | Automated Processor For Web Content To Mobile-Optimized Content Transformation |
US9165125B2 (en) | 2012-06-13 | 2015-10-20 | Mobilextension Inc. | Distribution of dynamic structured content |
US8595613B1 (en) | 2012-07-26 | 2013-11-26 | Viasat Inc. | Page element identifier pre-classification for user interface behavior in a communications system |
US9626678B2 (en) | 2012-08-01 | 2017-04-18 | Visa International Service Association | Systems and methods to enhance security in transactions |
US9313213B2 (en) | 2012-10-18 | 2016-04-12 | White Ops, Inc. | System and method for detecting classes of automated browser agents |
US8806627B1 (en) | 2012-12-17 | 2014-08-12 | Emc Corporation | Content randomization for thwarting malicious software attacks |
RU2522019C1 (en) | 2012-12-25 | 2014-07-10 | Закрытое акционерное общество "Лаборатория Касперского" | System and method of detecting threat in code executed by virtual machine |
US9584543B2 (en) | 2013-03-05 | 2017-02-28 | White Ops, Inc. | Method and system for web integrity validator |
US9225737B2 (en) | 2013-03-15 | 2015-12-29 | Shape Security, Inc. | Detecting the introduction of alien content |
US8869281B2 (en) | 2013-03-15 | 2014-10-21 | Shape Security, Inc. | Protecting against the introduction of alien content |
US9338143B2 (en) | 2013-03-15 | 2016-05-10 | Shape Security, Inc. | Stateless web content anti-automation |
US20140281535A1 (en) | 2013-03-15 | 2014-09-18 | Munibonsoftware.com, LLC | Apparatus and Method for Preventing Information from Being Extracted from a Webpage |
US20140283038A1 (en) | 2013-03-15 | 2014-09-18 | Shape Security Inc. | Safe Intelligent Content Modification |
US9729514B2 (en) | 2013-03-22 | 2017-08-08 | Robert K Lemaster | Method and system of a secure access gateway |
US9424424B2 (en) | 2013-04-08 | 2016-08-23 | Trusteer, Ltd. | Client based local malware detection method |
WO2014168936A1 (en) | 2013-04-10 | 2014-10-16 | Ho Lap-Wah Lawrence | Method and apparatus for processing composite web transactions |
US10320628B2 (en) | 2013-06-19 | 2019-06-11 | Citrix Systems, Inc. | Confidence scoring of device reputation based on characteristic network behavior |
WO2015001535A1 (en) | 2013-07-04 | 2015-01-08 | Auditmark S.A. | System and method for web application security |
US9015839B2 (en) | 2013-08-30 | 2015-04-21 | Juniper Networks, Inc. | Identifying malicious devices within a computer network |
US9961129B2 (en) | 2013-09-04 | 2018-05-01 | Cisco Technology, Inc. | Business transaction correlation with client request monitoring data |
US9270647B2 (en) | 2013-12-06 | 2016-02-23 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US8954583B1 (en) | 2014-01-20 | 2015-02-10 | Shape Security, Inc. | Intercepting and supervising calls to transformed operations and objects |
US9225729B1 (en) | 2014-01-21 | 2015-12-29 | Shape Security, Inc. | Blind hash compression |
US20150262182A1 (en) | 2014-03-12 | 2015-09-17 | The Toronto-Dominion Bank | Systems and methods for providing populated transaction interfaces based on contextual triggers |
US8997226B1 (en) | 2014-04-17 | 2015-03-31 | Shape Security, Inc. | Detection of client-side malware activity |
US9405910B2 (en) | 2014-06-02 | 2016-08-02 | Shape Security, Inc. | Automatic library detection |
US9667637B2 (en) | 2014-06-09 | 2017-05-30 | Guardicore Ltd. | Network-based detection of authentication failures |
US20150379266A1 (en) | 2014-06-26 | 2015-12-31 | DoubleVerify, Inc. | System And Method For Identification Of Non-Human Users Accessing Content |
US9075990B1 (en) | 2014-07-01 | 2015-07-07 | Shape Security, Inc. | Reliable selection of security countermeasures |
WO2016004227A1 (en) | 2014-07-02 | 2016-01-07 | Blackhawk Network, Inc. | Systems and methods for dynamically detecting and preventing consumer fraud |
US9686300B1 (en) | 2014-07-14 | 2017-06-20 | Akamai Technologies, Inc. | Intrusion detection on computing devices |
US9639699B1 (en) | 2014-07-18 | 2017-05-02 | Cyberfend, Inc. | Detecting non-human users on computer systems |
US9438625B1 (en) | 2014-09-09 | 2016-09-06 | Shape Security, Inc. | Mitigating scripted attacks using dynamic polymorphism |
US9716726B2 (en) | 2014-11-13 | 2017-07-25 | Cleafy S.r.l. | Method of identifying and counteracting internet attacks |
US9906544B1 (en) | 2014-12-02 | 2018-02-27 | Akamai Technologies, Inc. | Method and apparatus to detect non-human users on computer systems |
US9544318B2 (en) | 2014-12-23 | 2017-01-10 | Mcafee, Inc. | HTML security gateway |
US9813440B1 (en) | 2015-05-15 | 2017-11-07 | Shape Security, Inc. | Polymorphic treatment of annotated content |
US9986058B2 (en) | 2015-05-21 | 2018-05-29 | Shape Security, Inc. | Security systems for mitigating attacks from a headless browser executing on a client computer |
WO2017007705A1 (en) | 2015-07-06 | 2017-01-12 | Shape Security, Inc. | Asymmetrical challenges for web security |
WO2017007936A1 (en) | 2015-07-07 | 2017-01-12 | Shape Security, Inc. | Split serving of computer code |
US10264001B2 (en) | 2015-08-12 | 2019-04-16 | Wizard Tower TechnoServices Ltd. | Method and system for network resource attack detection using a client identifier |
US9807113B2 (en) | 2015-08-31 | 2017-10-31 | Shape Security, Inc. | Polymorphic obfuscation of executable code |
US20170118241A1 (en) | 2015-10-26 | 2017-04-27 | Shape Security, Inc. | Multi-Layer Computer Security Countermeasures |
US10375026B2 (en) | 2015-10-28 | 2019-08-06 | Shape Security, Inc. | Web transaction status tracking |
US10326790B2 (en) | 2016-02-12 | 2019-06-18 | Shape Security, Inc. | Reverse proxy computer: deploying countermeasures in response to detecting an autonomous browser executing on a client computer |
US10855696B2 (en) | 2016-03-02 | 2020-12-01 | Shape Security, Inc. | Variable runtime transpilation |
US9917850B2 (en) | 2016-03-03 | 2018-03-13 | Shape Security, Inc. | Deterministic reproduction of client/server computer state or output sent to one or more client computers |
- 2014-01-21: US application 14/160,107 (US9225729B1), Active
- 2015-12-28: US application 14/980,231 (US10212137B1), Active
- 2018-12-30: US application 16/236,566 (US20190140835A1), Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6343313B1 (en) * | 1996-03-26 | 2002-01-29 | Pixion, Inc. | Computer conferencing system with real-time multipoint, multi-speed, multi-stream scalability |
US7051362B2 (en) * | 2000-05-16 | 2006-05-23 | Ideaflood, Inc. | Method and system for operating a network server to discourage inappropriate use |
US20060085541A1 (en) * | 2004-10-19 | 2006-04-20 | International Business Machines Corporation | Facilitating optimization of response time in computer networks |
US8855143B1 (en) * | 2005-04-21 | 2014-10-07 | Joseph Acampora | Bandwidth saving system and method for communicating self describing messages over a network |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10536479B2 (en) | 2013-03-15 | 2020-01-14 | Shape Security, Inc. | Code modification for automation detection |
US11088995B2 (en) | 2013-12-06 | 2021-08-10 | Shape Security, Inc. | Client/server security by an intermediary rendering modified in-memory objects |
US10554777B1 (en) | 2014-01-21 | 2020-02-04 | Shape Security, Inc. | Caching for re-coding techniques |
US10382482B2 (en) | 2015-08-31 | 2019-08-13 | Shape Security, Inc. | Polymorphic obfuscation of executable code |
US11579985B2 (en) * | 2019-05-31 | 2023-02-14 | Acronis International Gmbh | System and method of preventing malware reoccurrence when restoring a computing device using a backup image |
Also Published As
Publication number | Publication date |
---|---|
US9225729B1 (en) | 2015-12-29 |
US10212137B1 (en) | 2019-02-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20190140835A1 (en) | | Blind Hash Compression |
US11171925B2 (en) | | Evaluating and modifying countermeasures based on aggregate transaction status |
US11070557B2 (en) | | Delayed serving of protected content |
US9973519B2 (en) | | Protecting a server computer by detecting the identity of a browser on a client computer |
US10193909B2 (en) | | Using instrumentation code to detect bots or malware |
US9411958B2 (en) | | Polymorphic treatment of data entered at clients |
US9858440B1 (en) | | Encoding of sensitive data |
US20170118241A1 (en) | | Multi-Layer Computer Security Countermeasures |
US9489526B1 (en) | | Pre-analyzing served content |
US9325734B1 (en) | | Distributed polymorphic transformation of served content |
Kim et al. | | WebMon: ML-and YARA-based malicious webpage detection |
US9112900B1 (en) | | Distributed polymorphic transformation of served content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: SHAPE SECURITY, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MOEN, DANIEL G; HANKS, BRYAN D; SIGNING DATES FROM 20140117 TO 20140120; REEL/FRAME: 047874/0100 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |