
3 core principles for secure data integration




When it comes to data, sharing isn't always caring.

Yes, the increased flow of data across departments like marketing, sales, and HR is doing much to power better decision-making, enhance customer experience and, ultimately, improve business outcomes. But this has serious implications for security and compliance.

This article will discuss why, then present three core principles for the secure integration of data.

Democratizing access to data: An important caveat

On the market today is an incredible range of no-code and low-code tools for moving, sharing, and analyzing data: extract, transform, load (ETL) and extract, load, transform (ELT) platforms, iPaaS platforms, data visualization apps, and databases as a service. All of these can be used relatively easily by non-technical professionals, with minimal oversight from administrators.

Moreover, the number of SaaS apps that businesses use today is constantly growing, so the need for self-serve integrations will likely only increase.

Many such apps, such as CRMs and ERPs, contain sensitive customer data, payroll data, invoicing data, and so forth. These tend to have strictly managed access levels, so as long as the data stays within them, there isn't much of a security risk.

But as soon as you take data out of these environments and feed it to downstream systems with completely different access-level controls, there emerges what we can term "access control misalignment."

People working with ERP data in a warehouse, for example, may not have the same level of trust from company management as the original ERP operators. So, by simply connecting an app to a data warehouse (something that is becoming necessary more and more often), you run the risk of leaking sensitive data.

This can result in violations of regulations like GDPR in Europe or HIPAA in the U.S., as well as of requirements for data security certifications like SOC 2 Type 2, not to mention a loss of stakeholder trust.

Three principles for secure data integration

How do you prevent the unnecessary flow of sensitive data to downstream systems? How do you keep it secure if it does need to be shared? And in the event of a potential security incident, how do you ensure that any damage is mitigated?


These questions can be addressed by the three principles below.

Separate concerns

By separating data storage, processing, and visualization functions, businesses can minimize the risk of data breaches. Let's illustrate how this works by example.

Imagine that you're an ecommerce company. Your main production database, which is connected to your CRM, payment gateway, and other apps, stores all your inventory, customer, and order records. As your company grows, you decide it's time to hire your first data scientist. Naturally, the first thing they do is ask for access to datasets with all the abovementioned information so that they can build data models for, let's say, how the weather affects the ordering process, or what the most popular product is in a particular category.

However, it's not very wise to give the data scientist direct access to your main database. Even if they have the best of intentions, they could, for example, export sensitive customer data from that database to a dashboard that's viewable by unauthorized users. Moreover, running analytics queries on a production database can slow it down to the point of inoperability.

The solution to this problem is to clearly define what kind of data needs to be analyzed and, using various data replication techniques, to copy that data into a secondary warehouse designed specifically for analytics workloads, such as Redshift, BigQuery, or Snowflake.

In this way, you prevent sensitive data from flowing downstream to the data scientist, and at the same time give them a secure sandbox environment that's completely separate from your production database.
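As a minimal sketch of this replication step, the snippet below copies only a defined, non-sensitive subset of columns from a "production" database into a separate "analytics" database. The table and column names are hypothetical, and sqlite3 stands in for both the production database and the warehouse (which in practice would be Redshift, BigQuery, or Snowflake behind a replication tool):

```python
import sqlite3

# Hypothetical production database with a table that mixes
# sensitive (customer_email) and non-sensitive columns.
prod = sqlite3.connect(":memory:")
prod.execute("""CREATE TABLE orders (
    id INTEGER, customer_email TEXT, product TEXT, amount REAL)""")
prod.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "alice@example.com", "keyboard", 49.0),
    (2, "bob@example.com", "mouse", 19.0),
])

# Separate analytics sandbox: its schema simply has no column
# for the sensitive data, so it can never hold it.
analytics = sqlite3.connect(":memory:")
analytics.execute("CREATE TABLE orders (id INTEGER, product TEXT, amount REAL)")

# Replicate only the columns the data scientist actually needs;
# customer_email never leaves the production environment.
rows = prod.execute("SELECT id, product, amount FROM orders ORDER BY id").fetchall()
analytics.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

print(analytics.execute("SELECT product, amount FROM orders").fetchall())
```

Analytics queries now run against the copy, so they can neither leak the excluded columns nor slow down production.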

Original image by Dataddo

Use data exclusion and data masking techniques

These two processes also help separate concerns, because they prevent the flow of sensitive information to downstream systems entirely.

In fact, most data security and compliance issues can be solved right when the data is being extracted from apps. After all, if there is no good reason to send customer phone numbers from your CRM to your production database, why do it?


The idea of data exclusion is simple: If you have a system in place that allows you to select subsets of data for extraction, like an ETL tool, you can simply not select the subsets that contain sensitive data.
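In code, exclusion amounts to an allowlist filter applied at extraction time. The sketch below mimics what an ETL tool's field picker does; the field names and the `extract` helper are illustrative, not any particular tool's API:

```python
# Fields that should never leave the source system, even if requested.
SENSITIVE_FIELDS = {"phone", "email", "ssn"}

def extract(record: dict, fields: set) -> dict:
    """Select only the requested, non-sensitive subset of a source
    record, the way an ETL tool's field picker would."""
    return {k: v for k, v in record.items()
            if k in fields and k not in SENSITIVE_FIELDS}

crm_row = {"id": 7, "name": "Alice", "phone": "555-0100", "plan": "pro"}
print(extract(crm_row, {"id", "plan", "phone"}))  # phone is dropped
```

Note that the sensitive-field check wins over the request: even if a downstream consumer asks for `phone`, it is never extracted.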

But, of course, there are some situations when sensitive data needs to be extracted and shared. This is where data masking/hashing comes in.

Let's say, for instance, that you want to calculate health scores for customers, and the only sensible identifier is their email address. This would require you to extract this information from your CRM to your downstream systems. To keep it secure from end to end, you can mask or hash it upon extraction. This preserves the uniqueness of the records but makes the sensitive information itself unreadable.

Both data exclusion and data masking/hashing can be achieved with an ETL tool.

As a side note, it's worth mentioning that ETL tools are generally considered more secure than ELT tools, because they allow data to be masked or hashed before it is loaded into the target system. For more information, consult this detailed comparison of ETL and ELT tools.

Maintain a strong system of auditing and logging

Finally, make sure there are systems in place that let you know who is accessing data, and how and where the data is flowing.

Of course, this is important for compliance, because many regulations require organizations to demonstrate that they are monitoring access to sensitive data. But it's also essential for quickly detecting and reacting to any suspicious behavior.
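The kind of record such a system needs can be sketched as a structured access-log event: who touched which dataset, doing what, and when. The event shape and logger name below are assumptions for illustration; in practice these events would be shipped to a SIEM or the warehouse's native audit facility rather than printed:

```python
import json
import logging
from datetime import datetime, timezone

# Dedicated audit logger emitting one JSON object per access event,
# so events are easy to filter and query during a forensic analysis.
audit = logging.getLogger("audit")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(message)s"))
audit.addHandler(handler)
audit.setLevel(logging.INFO)

def log_access(user: str, dataset: str, action: str) -> dict:
    """Record who accessed which dataset, how, and when (UTC)."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "action": action,
    }
    audit.info(json.dumps(event))
    return event

event = log_access("data.scientist", "orders_analytics", "SELECT")
```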

Auditing and logging are both the internal responsibility of companies themselves and the responsibility of the vendors of data tools, like pipelining solutions, data warehouses, and analytics platforms.

So, when evaluating such tools for inclusion in your data stack, it's important to pay attention to whether they have sound logging capabilities, role-based access controls, and other security mechanisms like multi-factor authentication (MFA). SOC 2 Type 2 certification is also a good thing to look for, because it's the standard for how digital companies should handle customer data.


This way, if a potential security incident ever does occur, you will be able to conduct a forensic analysis and mitigate the damage.

Access vs. security: Not a zero-sum game

As time goes on, businesses will increasingly be confronted with the need to share data, as well as the need to keep it secure. Fortunately, meeting one of these needs doesn't have to mean neglecting the other.

The three principles outlined above can underlie a secure data integration strategy in organizations of any size.

First, identify what data can be shared, and then copy it into a secure sandbox environment.

Second, whenever possible, keep sensitive datasets in source systems by excluding them from pipelines, and be sure to hash or mask any sensitive data that does need to be extracted.

Third, make sure that your business itself and the tools in your data stack have robust systems of logging in place, so that if anything goes wrong, you can minimize damage and investigate properly.

Petr Nemeth is the founder and CEO of Dataddo.

