
Denas Grybauskas is the Chief Governance and Strategy Officer at Oxylabs, a global leader in web intelligence collection and premium proxy solutions.
Established in 2015, Oxylabs operates one of the largest ethically sourced proxy networks in the world, spanning more than 177 million IPs across 195 countries, and offers advanced tools such as Web Scraper API and OxyCopilot, an AI-powered scraping assistant that converts natural-language instructions into structured queries.
You have an impressive legal and governance background in Lithuania's legal tech scene. What personally inspired you to take on one of AI's most polarizing challenges in your role at Oxylabs?
Oxylabs has always been a flag bearer for responsible innovation in the industry. We were the first to advocate for ethical proxy sourcing and industry standards in web scraping. Now, with AI advancing so quickly, we must make sure that innovation is balanced with responsibility.
We saw this as one of the biggest problems facing the AI industry, and we could also see the solution. By providing these datasets, we enable everyone to get on the same page regarding fair AI development, which benefits all. We knew how important it was to put creators' rights at the forefront while still providing material for the development of future AI systems, so we built these datasets as something that could meet the demands of today's market.
The UK is in the middle of a heated copyright battle, with strong voices on both sides. How would you describe the current state of the debate between AI innovation and creator rights?
While it is important that the UK government prioritizes productive technological innovation, creators must feel empowered and protected by AI, not stolen from. The legal framework currently under debate must find a sweet spot between fostering innovation and protecting creators, and I hope that in the coming weeks a way is found to strike that balance.
Oxylabs has just launched the world's first ethical YouTube dataset, which requires creator consent for AI training. How does this consent process actually work – and how scalable is it for other industries, such as music or publishing?
Every one of the millions of original videos in the dataset carries the creator's explicit consent to its use for AI training, connecting creators and innovators ethically. All datasets provided by Oxylabs include videos, transcripts, and rich metadata. While such data has many potential use cases, Oxylabs refined and prepared it specifically for AI training – the use that content creators have deliberately agreed to.
Many tech leaders argue that a strict opt-in requirement from all creators would “kill” the AI industry. What is your response to that claim, and how does Oxylabs' approach prove otherwise?
It is true that requiring prior explicit opt-in for every use of material in AI training presents significant operational challenges and would come at a considerable cost to AI innovation. Instead of protecting creators' rights, it could inadvertently encourage companies to move development activities to jurisdictions with less rigorous enforcement or different copyright regimes. However, this does not mean there can be no middle ground where AI development is encouraged while copyright is honored. What we need are workable mechanisms that simplify relations between AI companies and creators.
Datasets like ours offer one way forward. The opt-out model, under which material can be used unless the copyright owner explicitly opts out, is another. A third way would be to facilitate deal-making between publishers, creators, and AI companies through technical solutions such as online platforms.
Ultimately, any solution must operate within the bounds of copyright and data protection laws. At Oxylabs, we believe that AI innovation should be advanced responsibly, and our goal is to contribute to legitimate, practical frameworks that honor creators while enabling progress.
What were the biggest obstacles your team faced in making the consent-based dataset feasible?
The path was opened for us by YouTube, which allows content creators to license their work for AI training simply and conveniently. After that, our work was mostly technical: collecting the data, cleaning and structuring it to prepare the dataset, and building the entire technical setup that lets companies access the data they need. This is something we have been doing for years, in one way or another. Of course, each case presents its own set of challenges, especially when you are working with data as huge and complex as multimodal data. But we had both the know-how and the technical capacity. Thus, once YouTube authors were given the chance to consent, the rest was only a matter of putting our time and resources into it.
Beyond YouTube content, do you envision a future where other major content types – such as music, writing, or digital art – could be systematically licensed for use as training data?
For some time, we have been signaling the need for a systematic approach to enabling AI innovation while balancing it with creators' rights. There will be mutual benefit only when both sides have a convenient and cooperative way to achieve their goals.
This is just the beginning. We believe that datasets like ours, offered across a range of industries, can provide a solution that finally brings the copyright debate to an amicable close.
Do ethically sourced datasets like the ones Oxylabs offers carry different weight in different jurisdictions, such as the European Union and the UK?
On the one hand, the availability of clear consent-based datasets levels the playing field for AI companies in regions where governments lean towards stricter regulation. The primary concern of these companies is that strict consent requirements, instead of supporting creators, will simply hand an unfair advantage to AI developers in other jurisdictions. The problem is not that these companies do not care about consent but that, without a convenient way to obtain it, they fall behind.
On the other hand, we believe that if giving consent and accessing licensed data for AI training is made easy, there is no reason why this approach should not become the preferred method globally. Our datasets built on licensed YouTube content are a step towards this simplification.
With growing public mistrust of how AI is trained, how do you think transparency and consent can become a competitive advantage for tech companies?
Although transparency is often seen as a hindrance to competitive edge, it is also our biggest weapon against mistrust. The more transparency AI companies provide, the more evidence there is of ethical and beneficial AI training, which helps rebuild trust in the AI industry. In return, creators will be more willing to contribute value in the future, seeing that both they and society can gain from AI innovation.
Oxylabs is often associated with data scraping and web intelligence. How does this new ethical initiative fit into the company's broader vision?
The release of the ethically sourced YouTube dataset continues our mission at Oxylabs to establish and promote ethical industry practices. As part of this mission, we co-founded the Ethical Web Data Collection Initiative (EWDCI) and introduced an industry-first transparent tier framework for proxy sourcing. We also launched Project 4β to help researchers and academics maximize their research impact and improve the understanding of critical public web data.
Looking ahead, do you think governments should mandate consent-by-default for training data, or should it remain a voluntary, industry-led initiative?
In a free market economy, it is generally best to allow the market to correct itself. By letting innovation respond to market needs, we constantly renew our prosperity. Heavy-handed legislation is never a good first choice and should be resorted to only when all other avenues for ensuring justice have been exhausted.
It does not seem that we have reached that point in AI training. YouTube's licensing options for creators and our datasets show that the ecosystem is actively looking for ways to adapt to new realities. Thus, while clear regulation is certainly needed to ensure that everyone acts within their rights, governments may want to tread lightly. Instead of requiring express consent in every case, they could examine the mechanisms the industry develops to resolve the current tensions and take their cues from those, so that any law that is enacted encourages innovation rather than obstructs it.
What advice would you give to startups and AI developers who want to prioritize ethical data usage without stifling innovation?
One way startups can help is by developing technical solutions that facilitate ethical data use – solutions that simplify the process of obtaining consent from creators and deriving value for them. And with transparently sourced data available as an alternative, AI companies do not need to compromise on speed, so I advise them to keep their eyes open for such offerings.
Thank you for the great interview; readers who want to learn more should visit Oxylabs.