We protect big data privacy by mathematically decomposing the data into randomized fragments that are unrecognizable, unlinkable, and uninterpretable. These shredded fragments can then be sent over the internet or stored across multiple clouds, servers, or devices with metadata privacy. An attacker would need to search through a sea of big data to identify the fragments that belong to a particular record. In essence, we are building a big data shredder: it disperses the information from different software applications into unrecognizable fragments before the data is stored or communicated, and in doing so provides distributed trust in protecting big data privacy. With the proposed technology, data at rest and data in transit are protected by different security mechanisms. Hackers would have to gain access to multiple storage locations and communication channels to retrieve the shredded fragments and recover the original records. For authorized users, the original data can be reconstructed instantaneously in any software application, environment, or platform from the metadata and the corresponding shredded fragments. The impact of a security breach in a software application is confined to the amount of metadata lost in the attack. This differentiates the proposed technology from encryption, where the loss of an encryption key may lead to massive data loss. As a mathematical technique, the proposed solution can be readily integrated and combined with existing privacy-preserving technologies such as anonymization and encryption.
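The idea of decomposing a record into randomized fragments can be illustrated with a small sketch. The text does not specify the decomposition used by the proposed technology, so the example below substitutes a well-known technique, XOR-based splitting, in which every fragment on its own is indistinguishable from random bytes and all fragments are needed to rebuild the record. The function names `shred` and `reconstruct` are hypothetical and chosen only for illustration.

```python
import secrets


def shred(record: bytes, n: int) -> list[bytes]:
    """Split a record into n fragments; all n are needed to rebuild it.

    Illustrative XOR splitting, an assumption and not the proposed
    technology's actual decomposition.
    """
    if n < 2:
        raise ValueError("need at least two fragments")
    # The first n-1 fragments are uniformly random bytes.
    fragments = [secrets.token_bytes(len(record)) for _ in range(n - 1)]
    # The last fragment is the record XORed with every random fragment.
    last = bytearray(record)
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            last[i] ^= byte
    fragments.append(bytes(last))
    return fragments


def reconstruct(fragments: list[bytes]) -> bytes:
    """XOR all fragments together to recover the original record."""
    if not fragments:
        raise ValueError("no fragments given")
    record = bytearray(len(fragments[0]))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            record[i] ^= byte
    return bytes(record)


if __name__ == "__main__":
    original = b"example record"
    pieces = shred(original, 3)
    print(reconstruct(pieces) == original)
```

In this sketch the fragments can be combined in any order, and an attacker who holds fewer than all of them learns nothing about the record beyond its length.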
The proposed big data dispersal technology works on both structured and unstructured data. The current implementation focuses on biometric data (e.g., facial, audio, and gait) and healthcare data (e.g., MRI, CT scan, and sensor data). Encryption and access control are applied to both the metadata and the shredded data fragments in a decentralized manner, so that no complicated encryption key management and distribution is needed. Any software application can reconstruct the original records if it is given authorized access to the metadata and the shredded fragments. The current implementation provides layered protection and distributed trust to ensure both data-at-rest and data-in-motion security.
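As a rough illustration of how metadata could tie dispersed fragments back together, the sketch below places fragments in several stores under random identifiers and returns the list of locations as metadata. Everything here is an assumption made for illustration: the metadata format, the round-robin placement, the names `disperse` and `collect`, and the in-memory dictionaries standing in for clouds, servers, or devices. Encryption and access control of the metadata are left out.

```python
import secrets


def disperse(fragments, stores):
    """Spread fragments over the stores; return the metadata to find them."""
    if not stores:
        raise ValueError("need at least one store")
    store_names = list(stores)
    metadata = []
    for position, fragment in enumerate(fragments):
        # Round-robin placement so consecutive fragments land in different stores.
        name = store_names[position % len(store_names)]
        # A random ID reveals nothing about the record or the fragment order.
        fragment_id = secrets.token_hex(16)
        stores[name][fragment_id] = fragment
        metadata.append((name, fragment_id))
    return metadata


def collect(metadata, stores):
    """Fetch the fragments in their original order using the metadata."""
    return [stores[name][fragment_id] for name, fragment_id in metadata]


if __name__ == "__main__":
    # Hypothetical storage locations, modeled as plain dictionaries.
    stores = {"cloud_a": {}, "cloud_b": {}, "device_c": {}}
    fragments = [b"frag-0", b"frag-1", b"frag-2", b"frag-3"]
    metadata = disperse(fragments, stores)
    print(collect(metadata, stores) == fragments)
```

Without the metadata, the contents of a single store are a set of unlabeled fragments with no indication of which record they belong to or in what order they go, which is the property the paragraph above relies on.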
Enterprises routinely collect big data to provide real-time analytics. The proposed big data dispersal technology helps them protect big data privacy in today's complex software environments. Privacy-preserving applications include biometric, healthcare, and IoT data, all of which could leak a large amount of sensitive information.
The data security market stands at approximately SGD 16 billion per year, most of which is spent on anonymization, shredding, encryption, and access control technology. Our aim is to provide a low-cost, subscription-based solution so that the proposed technology can be easily adopted by the mass market, e.g., SMEs and MNCs. As more enterprises migrate their software applications to public clouds, the proposed technology benefits them by providing distributed trust in complex software environments with a growing number of attack vectors.
Big data encryption is complicated in terms of key management and distribution, whereas existing shredder technology requires centralized servers to manage large metadata files. Our proposed technology does not require centralized servers for metadata management and hence removes this single point of failure in big data privacy protection.