Comcast's latest open source contribution takes a distinct cybersecurity angle in the form of a software tool that uses a blend of artificial intelligence (AI) and machine learning (ML) techniques to uncover and expose "secrets" that are inadvertently left in open source code.
Such secrets in open source repositories could include credentials, such as passwords, as well as API keys and tokens.
Figure 1:
(Source: Marcos Alvarado/Alamy Stock Photo)
Comcast originally developed this open source data security tool, called "xGitGuard," to ensure the company was not exposing its own secrets in software placed in GitHub repositories. The tool, which has been in use at Comcast since 2020, also utilizes natural language processing and text processing algorithms developed in-house.
Under both models supported by xGitGuard (passwords/credentials and API keys and tokens), the tool follows a six-step process: search GitHub at scale, filter the results, detect and extract secrets, identify the developer, validate the secrets and then submit them for remediation.
Figure 2:
(Source: Comcast's xGitGuard page. Used with permission)
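To make the six steps more concrete, here is a minimal Python sketch of how such a pipeline could be chained together. It is not xGitGuard's code: the `Hit` and `Finding` types, the function names, the file-extension filter and the sample data are all hypothetical, the search runs over an in-memory list rather than GitHub, and a simple length and character-entropy check stands in for the ML-based validation Comcast describes.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hit:
    """One hypothetical search result: a snippet of text found in a repository."""
    repo: str
    path: str
    author: str
    text: str


@dataclass
class Finding:
    """A validated candidate secret, tied to the developer who committed it."""
    repo: str
    path: str
    author: str
    secret: str


# Assumed keywords and pattern for illustration only.
KEYWORDS = ("password", "api_key", "token")
ASSIGNMENT = re.compile(
    r"""(?:password|passwd|api[_-]?key|token)\s*[:=]\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    counts = Counter(value)
    return -sum(n / len(value) * math.log2(n / len(value)) for n in counts.values())


def search(corpus):
    """Step 1 (stand-in): keep snippets that mention a secret-related keyword."""
    return [h for h in corpus if any(k in h.text.lower() for k in KEYWORDS)]


def filter_hits(hits):
    """Step 2 (assumed rule): skip documentation files."""
    return [h for h in hits if not h.path.lower().endswith((".md", ".txt"))]


def detect_and_extract(hit):
    """Step 3: pull quoted values out of key = "value" style assignments."""
    return ASSIGNMENT.findall(hit.text)


def looks_like_secret(value: str) -> bool:
    """Step 5 (stand-in for ML validation): long enough and random-looking."""
    return len(value) >= 8 and shannon_entropy(value) >= 3.0


def run_pipeline(corpus):
    findings = []
    for hit in filter_hits(search(corpus)):
        for candidate in detect_and_extract(hit):
            owner = hit.author  # Step 4: identify the developer
            if looks_like_secret(candidate):
                findings.append(Finding(hit.repo, hit.path, owner, candidate))
    return findings


def submit(findings):
    """Step 6: build redacted remediation notices instead of echoing the secret."""
    return [
        f"{f.repo}/{f.path}: notify {f.author} about {f.secret[:4]}..."
        for f in findings
    ]


# Made-up sample data; none of these values are real credentials.
SAMPLE_CORPUS = [
    Hit("org/app", "config.py", "alice", 'API_KEY = "sk9fJ2lQx8Zr4TbV7mNc"'),
    Hit("org/app", "README.md", "bob", 'password = "Xy7#kP2qLm9!"'),
    Hit("org/web", "settings.py", "carol", 'password = "changeme"'),
]
```

Calling `submit(run_pipeline(SAMPLE_CORPUS))` on the sample data flags only the `config.py` entry: the README is dropped by the filter, and the placeholder value `changeme` fails the entropy check. A production tool would replace each stand-in with something far more robust.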
A general aim is to detect those secrets at scale and with a high level of accuracy when compared to other, similar tools available in the open source world, according to Dr. Bahman Rashidi, director of Comcast Cable's Cybersecurity and Privacy Engineering Research team.
Comcast estimates in a blog post that the tool's validation step is more than 90% accurate at distinguishing secrets from non-secret text.
But the broader goal is to put developers and organizations in a better position to detect credentials and other secrets that have been inadvertently exposed publicly. The tool can be used to find secrets in code already posted to GitHub, or proactively, before code ever makes its way to GitHub.
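As a rough illustration of the proactive case, the Python sketch below scans a local source tree for lines that look like hard-coded credentials before anything is pushed. It does not reflect xGitGuard's interface: `scan_text`, `scan_tree`, `main` and the regular expression are hypothetical, and a plain pattern match is a much cruder check than the ML-based detection the tool is said to use.

```python
import re
from pathlib import Path

# Assumed pattern: a secret-like name assigned a quoted literal of 8+ characters.
PATTERN = re.compile(
    r"""(password|passwd|api[_-]?key|token)\s*[:=]\s*["']([^"']{8,})["']""",
    re.IGNORECASE,
)


def scan_text(text):
    """Return (line_number, key_name) pairs for suspicious assignments."""
    return [
        (number, match.group(1))
        for number, line in enumerate(text.splitlines(), start=1)
        for match in PATTERN.finditer(line)
    ]


def scan_tree(root):
    """Walk a directory and collect (path, line_number, key_name) results."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and Git's own metadata.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # Ignore binary or unreadable files.
        results.extend((str(path), number, key) for number, key in scan_text(text))
    return results


def main(argv):
    """Print findings and return a non-zero status if anything was flagged."""
    root = argv[1] if len(argv) > 1 else "."
    hits = scan_tree(root)
    for path, number, key in hits:
        print(f"{path}:{number}: possible hard-coded {key}")
    return 1 if hits else 0
```

A script like this could be run from a Git pre-commit hook by ending it with `sys.exit(main(sys.argv))`, so that a flagged file blocks the commit. Values read from the environment, such as `password = os.environ["DB_PASSWORD"]`, are not flagged because no quoted literal follows the assignment.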
"If [passwords and API tokens] are in the wrong hands, that's going to have some consequences for organizations," explained Rashidi, the developer of xGitGuard. He said the exposure of passwords and API tokens in open source code is a "very known problem in the industry. It happens frequently."
Comcast's work on xGitGuard emerges from a group at the company that spearheads and streamlines various open source projects. Last year, Comcast had two open source projects – Kuberhealthy and Trickster – accepted as "Sandbox" projects by the Cloud Native Computing Foundation (CNCF).
Kuberhealthy monitors the health of Kubernetes clusters in real time and streams that data into a Prometheus dashboard that tracks cloud native applications. Trickster speeds up the rendering of dashboards that draw on Prometheus data.
— Jeff Baumgartner, Senior Editor, Light Reading