- Audience:
- Community
- Enterprise
Archie is powered by our own private instance of Meta Llama 3.
Archie does not use data from customer datasets. Depending on the task, we wrap the needed contextual metadata from the knowledge graph (including resource type, description, raw SQL, etc.) into a prompt we send to the AI model. Where applicable (e.g., text-to-SQL), we also wrap user inputs to get better responses.
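As an illustration of this metadata wrapping, here is a minimal sketch of how knowledge-graph context plus an optional user input could be assembled into a prompt. The field names, template, and `build_prompt` helper are hypothetical, for illustration only; they are not data.world's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResourceMetadata:
    """Hypothetical subset of knowledge-graph metadata for a catalog resource."""
    resource_type: str   # e.g. "table" or "column"
    description: str     # existing description, if any
    raw_sql: str         # defining query, where applicable

def build_prompt(metadata: ResourceMetadata, user_input: Optional[str] = None) -> str:
    """Wrap contextual metadata (and, where applicable, user input) into a prompt."""
    parts = [
        f"Resource type: {metadata.resource_type}",
        f"Description: {metadata.description}",
        f"SQL: {metadata.raw_sql}",
    ]
    if user_input:  # e.g. a natural-language question for text-to-SQL
        parts.append(f"User request: {user_input}")
    return "\n".join(parts)
```

Note that only metadata, not rows from customer datasets, enters the prompt in this sketch, mirroring the description above.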
No. Archie is powered by a foundation model. We do not train or fine-tune the AI model or Archie with customer metadata, data, or inputs. We assess the accuracy of Archie outputs via user feedback (within the app and via support) as well as benchmark testing with our own test metadata.
Our identity and access management controls are designed to prevent unauthorized access to data across customer accounts, and we employ industry-standard security practices designed to ensure that data is encrypted in transit over the network as well as at rest. Information is passed along as transient state information to our private instance of the AI model.
Customers who utilize our platform and features, including Archie, agree to abide by our Acceptable Use Policy. By using Archie, customers and their users also agree to abide by Meta’s Llama 3 license. Customers remain responsible and liable for the use or distribution of their inputs and the outputs generated by Archie. Any outputs that users decide to keep, manage, and maintain, are the sole responsibility of those users.
Archie does not require raw data to function, nor does Archie attempt to mask PII or sensitive data. To the extent that PII or sensitive data appears in your catalog or user inputs, Archie will process it.
User inputs and Archie outputs are stored as logs for diagnostic purposes. Current retention is 7 days. For Archie outputs that a user chooses to save to the user's graph (e.g., a suggested description, question, or query), retention is at the customer's discretion.
Access to data.world is controlled via SSO, and this data is not accessible by or given to third-party generative AI systems. Our identity and access management controls are designed to prevent unauthorized access to data across customer accounts, and we employ industry-standard security practices designed to ensure that data is encrypted in transit over the network as well as at rest. Information is passed along as transient state information to our private instance of the AI model.
data.world runs an internally hosted LLM, distributed across multiple nodes. We have standard rate limiting and usage thresholds in place which are designed to govern access and isolate user traffic for uptime and performance purposes across our user base.
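The per-user rate limiting described above can be sketched generically with a token bucket: each user gets a bucket that refills at a steady rate, so one user's burst of traffic cannot starve others. This is a minimal, generic sketch of the technique, not data.world's actual implementation; the capacity and refill values are made up.

```python
import time
from collections import defaultdict

class TokenBucket:
    """Simple token bucket: requests are allowed while tokens remain."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens in proportion to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per user isolates each user's traffic from the rest.
buckets = defaultdict(lambda: TokenBucket(capacity=5, refill_per_sec=1.0))

def request_allowed(user_id: str) -> bool:
    return buckets[user_id].allow()
```

Because each user draws from a separate bucket, a single user exhausting their quota only throttles that user, preserving uptime and performance for everyone else.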
No. Archie is powered by a foundation model. We do not train or fine-tune Archie with customer metadata, data, or inputs. We assess the accuracy of Archie outputs via user feedback (within the app and via support) as well as benchmark testing with our own test metadata.
No. We do not share customer data with third parties to provide the generative AI services. Archie does not use data from customer datasets. Depending on the task, we wrap the needed contextual metadata from the knowledge graph (including resource type, description, raw SQL, etc.) into a prompt we send to our internally hosted AI model. Where applicable (e.g., text-to-SQL), we also wrap user inputs to get better responses.
For instance, when suggesting a description for a table or column, is it based on just the resource name, other table and column names, other descriptions in the collection, or data from datasets?
Archie does not use data from customer datasets. Depending on the task, we wrap the needed contextual metadata from the knowledge graph (including resource type, description, raw SQL, etc.) into a prompt we send to the model. Where applicable (e.g., text-to-SQL), we also wrap user inputs to get better responses.