Revolutionizing Job Descriptions with API-Driven Text Generation: A New Era in Recruitment

A journalist called me last week asking about using generative models to rewrite job-related texts and publish them on the internet. Personally, I think it's pointless. First, almost everyone nowadays writes content in certain recognizable ways, and if you start generating it, Google will quickly recognize the text as AI-generated and drop it. I constantly see on the market, and on various social networks, reports of websites getting dropped from the index for publishing generated texts. As an experiment, of course, you could try generating a few job texts with a tool like ChatGPT, or even rewriting existing ones. However, rewriting can be expensive: the API bills tokens both in and out, so feeding a whole text in and getting a comparable text back costs roughly twice as much as generating from a short prompt.
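As a back-of-the-envelope illustration of that cost difference (the per-token prices and text lengths below are hypothetical, purely for the arithmetic):

```python
# Rough cost comparison: rewriting a full text vs. generating from a short summary.
# PRICE_IN / PRICE_OUT are hypothetical per-token prices, not current rates.
PRICE_IN = 0.001 / 1000   # hypothetical price per input token
PRICE_OUT = 0.002 / 1000  # hypothetical price per output token

desc_tokens = 800     # assumed length of a typical job description
summary_tokens = 100  # assumed length of a short summary used as the prompt

# Rewriting: the whole description goes in, and a comparable amount comes out.
rewrite_cost = desc_tokens * PRICE_IN + desc_tokens * PRICE_OUT

# Generating from a summary: only the short summary goes in.
generate_cost = summary_tokens * PRICE_IN + desc_tokens * PRICE_OUT

print(f"rewrite: ${rewrite_cost:.4f}, generate from summary: ${generate_cost:.4f}")
```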

If, for instance, you use a different framework like spaCy to create a short summary of a job description and then use that summary as a prompt to generate a full job description, the method might be more cost-effective and possibly unique enough to avoid being flagged and removed by Google. It's worth experimenting with this approach, perhaps with a client, to see the response, whether it drives traffic and engagement or not.
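As a rough sketch of that idea, here is a minimal frequency-based extractive summary with spaCy, along the lines of the summarization notebook linked at the end; the model name and sample text are placeholders:

```python
# Minimal extractive summary with spaCy: score sentences by the frequency of
# their content words, keep the top ones, and use the result as a prompt seed.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

job_description_text = (  # placeholder text
    "We are looking for a data engineer to join our team. "
    "You will build and maintain ETL pipelines in Python. "
    "Experience with SQL and Airflow is a plus. "
    "The team works remotely across Europe and owns the data platform end to end."
)

def summarize(text: str, n_sentences: int = 2) -> str:
    doc = nlp(text)
    freq = Counter(
        tok.lemma_.lower() for tok in doc if tok.is_alpha and not tok.is_stop
    )
    ranked = sorted(
        doc.sents,
        key=lambda s: sum(freq[t.lemma_.lower()] for t in s if t.is_alpha),
        reverse=True,
    )
    # Restore document order so the summary still reads naturally.
    top = sorted(ranked[:n_sentences], key=lambda s: s.start)
    return " ".join(s.text.strip() for s in top)

prompt_seed = summarize(job_description_text)
print(prompt_seed)
```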

Yet the generators and their APIs appear very simple, almost like a child's toy. It's hard to believe these people made millions with this product. At first glance it looks like a complete joke, yet with this 'clownery' they have managed to surpass Google. I keep noticing that I've almost stopped using Google altogether, spending all my time in the chat, with most of my queries now going there rather than to Google. If someone had told me a year ago, when people from Amsterdam came and asked us about all these models, how to build them, why and for what purposes the databases behind them are used, and how to set them up, I would never have believed it would go this far and become such a huge success.

I always saw it as a rhetorical question that didn't even require an answer. I thought it was just people playing around, not realizing they were wasting their time, much like people who create toys, play with them themselves, and then make them for others. But when you look at the statistics and see that 75% of the earnings in Google's Play Store come from games, it means that, essentially, most of humanity doesn't need anything except games. Can you imagine the vast number of people who simply use technology for entertainment? And yet, amidst all this, there is such a useful thing as...

I never would have thought that something as simple as chatting with a model would become so popular and so widely used for text generation. I now find it genuinely helpful, especially in languages that are not my native language or that I don't use daily.

If journalists themselves are calling to ask about it, it would be embarrassing not to try it and see how it really works and what it can bring to real clients. It's not just about writing thousands of pieces in a row, but about using different methods for clients whose job offers people are actually searching for right now. It's not just rewriting, but optimizing the offers properly for search engines and reformulating the text correctly. Because in 90% of cases, when I read job offers in any language, including English, I want to cry over how they are written and how poorly they are optimized for search queries.

This leads to fewer candidates responding to job offers, because they simply can't find them due to poor optimization. The keywords are bad and the text itself is poorly written, which is a real problem. However, that's not the context the person who approached me was talking about; they were asking about generated and published offers. On the other hand, people themselves write such poor job descriptions. If you're not from that sector, you wouldn't even know that you need to have worked in it for ten years to write a good job offer that takes all the nuances and details into account. It looks like just one page of text, but it's an entire science, similar to writing an article for a newspaper or Wikipedia. Each of these is a different format, and a job offer is its own format that people currently don't understand well. Even those who have worked in the sector for a long time don't understand how to optimize a job offer for search engines.

So, what we'll test this weekend is using the GPT Turbo chat model to generate a good lead that becomes a quality job offer. We'll rewrite offers in such a way that they attract as many candidates as possible. From this, we'll be able to build our own model tailored to these parameters, trained on a large amount of data.
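For a first test, something like this minimal sketch should do, using the OpenAI Python client from the quickstart linked below; the model name, temperature, and prompt wording are my own placeholders rather than a settled configuration:

```python
# Minimal rewrite call with the OpenAI Python client (per the linked quickstart).
# Model, temperature, and prompt are placeholders to be tuned.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rewrite_job_offer(description: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.7,
        messages=[
            {
                "role": "system",
                "content": "You rewrite job offers so they read well and "
                           "target the search queries candidates actually use.",
            },
            {"role": "user", "content": description},
        ],
    )
    return response.choices[0].message.content
```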

The goal is to ensure that the terminology job seekers use in the skills sections of their resumes aligns well, statistically, with the description of the job itself. People who list certain skills in their resumes also search for jobs using those skills, among other things; they don't just search for jobs, they search by the titles of their previous positions and by their skills.
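One crude way to measure that alignment, assuming we already have a list of skill terms extracted from resumes (the skill list and sample offer below are stand-ins), is to check what share of those terms actually appears in the job description:

```python
# What share of the skill terms candidates use shows up in a given job offer?
import re

resume_skills = {"python", "sql", "etl", "airflow", "data pipeline"}  # stand-in data

def skill_coverage(description: str, skills: set[str]) -> float:
    text = description.lower()
    hits = {s for s in skills if re.search(r"\b" + re.escape(s) + r"\b", text)}
    return len(hits) / len(skills)

offer_text = "We need a Python developer to build ETL pipelines and write SQL."
print(f"{skill_coverage(offer_text, resume_skills):.0%} of skill terms covered")
```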

The queries themselves behave quite interestingly, but the results are, of course, far from the expected level of optimization.

In the first version, we won't create any summary. We'll simply take the existing text description, stripped of HTML, the one the client already provides, and submit it as-is, combined with the long prompt I put together using their docs and cookbook.
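A sketch of that preprocessing, stripping the tags with the standard library and counting input tokens with tiktoken as in the cookbook linked below (the sample HTML is a placeholder):

```python
# Strip HTML from the stored description and check its token count before
# pasting it into the long prompt as-is.
import tiktoken
from html.parser import HTMLParser

class TagStripper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_html(html: str) -> str:
    parser = TagStripper()
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())  # normalize whitespace

raw_description_html = "<p>We are hiring a <b>data engineer</b> in Berlin.</p>"  # placeholder
plain = strip_html(raw_description_html)

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print(plain)
print(len(enc.encode(plain)), "input tokens")
```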

Well, there's clearly still a lot of work to be done before the prompt truly delivers the desired results. I've tried rewriting a few offers with my quickly assembled prompt, and the job offers produced are immediately recognizable as non-unique: they are quickly found through a search, which means Google will also identify them and remove them. Therefore, specifically for search purposes, the offers that have already been published will either have to be linked back to the source, Stepstone, or deleted. Hence the focus is clear: achieving the desired quality.

You can experiment to see what percentage of unique text is needed. I can already see that rewriting with a different model, not necessarily from this provider, could give better results. We don't need an overly complex model; we can calmly use statistical methods to rewrite these texts. In principle, text generated that way comes out better, as it did last time for the automotive sector, where we worked together with a copywriter. Rewriting through the GPT chat combined with our own statistical methods in spaCy would yield better results than what is currently produced, which is almost entirely non-unique, can be traced back to the original texts, and is therefore unlikely to succeed. I don't know what percentage would succeed if we made a selection from ten thousand instead of ten, since that would have to be automated and checked automatically with a script, but the first manual results are not worth the effort.
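The automated check could start as simply as measuring how many word n-grams a rewrite shares with its original, since long exact matches are exactly what a phrase search (and presumably Google) picks up on; a sketch with placeholder texts:

```python
# Crude uniqueness check: share of the rewrite's word 5-grams that also
# occur in the original text.
def shingles(text: str, n: int = 5) -> set[tuple]:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(original: str, rewrite: str, n: int = 5) -> float:
    a, b = shingles(original, n), shingles(rewrite, n)
    return len(a & b) / max(len(b), 1)

original_text = "We are seeking a motivated data engineer to join our growing team in Berlin"
rewritten_text = "Our growing team in Berlin is seeking a motivated data engineer"

score = overlap(original_text, rewritten_text)
print(f"{score:.0%} of the rewrite's 5-grams also occur in the original")
```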

Overall, after testing about a hundred texts, it's clear that rewriting through the GPT chat is a complete dead end. Anyone doing this now is wasting time and money; it simply won't work. The rewritten texts are too generic, and too many matches turn up on the internet: almost immediately, three or four websites are found with exactly the same parts of these job descriptions. So this is a useless endeavor; Google will connect them and display only the source where the content originally appeared. It might be better to try rewriting through several iterations or something similar, but as it stands, a single rewrite accomplishes nothing.

What yielded good results was our own algorithm, written entirely with the help of spaCy and statistics and with .... This way we got unique text, though it was a bit awkward and not very readable. Formatting it, making it look nice, for example marking it up with HTML, is a task GPT coped with well. But the text given to it must already be unique: GPT struggles to make text unique and to pass AI detection. A person should provide it with unique text they have written themselves; GPT then corrects the grammar and wraps it in HTML, and it does quite well under those conditions. Perhaps it's even intentionally set up so that it doesn't make text too unique; otherwise search engines would immediately be flooded with content generated by millions of users copying from one another, which would kill the internet as we know it. No matter what you write, GPT does not make it truly unique, and if it is copied, Google will see it.
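A sketch of that final step, again with the OpenAI client and my own placeholder prompt: the model is asked only to correct grammar and add HTML markup, not to rephrase:

```python
# Format already-unique text: grammar fixes and HTML layout only.
from openai import OpenAI

client = OpenAI()

def format_unique_text(unique_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.2,  # low temperature: fix, don't rewrite
        messages=[
            {
                "role": "system",
                "content": "Correct the grammar of the text and format it as "
                           "clean HTML with headings, paragraphs, and lists. "
                           "Do not change the wording beyond grammatical fixes.",
            },
            {"role": "user", "content": unique_text},
        ],
    )
    return response.choices[0].message.content
```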


The conclusion drawn from this experience is that while generative AI models like GPT can be useful for certain tasks, they have limitations when it comes to creating unique, high-quality content, particularly for specific applications like job descriptions. The tested method of using a specialized algorithm, which combines statistical analysis from tools like spaCy with manual input, appears to be more effective. This approach produces content that is unique enough to stand out to search engines, avoiding the duplication issues that are common with purely AI-generated text.

However, the resulting text often requires further editing and formatting, such as marking it up with HTML, to ensure it meets quality standards. This indicates that while AI can significantly aid the content creation process, human intervention is still crucial for fine-tuning and for ensuring the content's relevance and appeal. Additionally, the models may be deliberately limited in how unique their output can be, which would explain why purely generated content runs into search engine penalties for non-unique text.



In summary, the combination of AI and human oversight seems to be the most effective strategy for generating unique and high-quality job descriptions. This hybrid approach leverages the efficiency and capabilities of AI while relying on human expertise for quality assurance and the nuances of effective communication.

Thierry Breton and the European AI Act

Thierry Breton, a member of the European Commission, confirmed that an agreement has been reached on the European AI Act, marking a significant step in the regulation of artificial intelligence in the EU. The act, which is expected to take effect in about two years, aims to strike a balance between fostering innovation and protecting the rights of individuals and companies. The draft regulations are set to be published in approximately 12 months, with the final provisions taking effect after around 24 months, making it unlikely that the EU AI Act will be fully in force before 2026.

The act addresses various aspects of AI usage, including transparency and governance requirements. It has provisions for banning certain applications of AI that could threaten citizens' rights and democracy. These prohibitions include biometric categorization systems using sensitive characteristics, predictive policing, untargeted scraping of facial images to create recognition databases, emotion recognition in workplaces and educational institutions, social scoring based on social behavior, manipulation of human behavior to circumvent free will, and AI used to exploit vulnerabilities in people due to age, disability, or social status.

Moreover, the act imposes stringent sanctions for non-compliance, with fines ranging up to €35 million or 7% of global turnover, depending on the infringement and the size of the company. The legislation also includes an "updating mechanism" to adapt to the constant changes in technology.

The EU AI Act also delineates a two-tier approach for large AI models, like GPT-4, with transparency requirements for all general-purpose AI models and stronger requirements for those with systemic impacts. This includes a system to assess and tackle systemic risks and aligning on clear definitions to provide legal certainty to model developers.

One key aspect of the agreement is its approach towards open-source AI, which faces only very limited and light requirements, potentially paving the way for the success of open-source generative AI foundation models. The agreement also reflects careful deliberation over how far to restrict live biometric identification tools, with the final text leaning towards limited use in public spaces subject to additional safeguards.

The AI Act is seen as more than just a set of rules; it's viewed as a launchpad for EU startups and researchers to lead in trustworthy AI, promoting innovation along the AI value chain. The agreement, once formalized by the EU member states and the parliament, will be a critical step toward establishing landmark AI policy in the EU, potentially influencing global AI regulation trends.


The European Union's forthcoming AI Act, which is expected to be adopted by the end of 2023, will introduce significant regulations surrounding the use of artificial intelligence (AI) in various domains, including the workplace. This groundbreaking legislation carries implications for a wide range of AI applications, including the automatic generation of job descriptions.

The Act categorizes AI systems based on their risk levels, with a specific focus on high-risk AI systems. These systems include those used in recruitment, employee evaluation, task allocation, and decision-making related to promotion or termination. For such high-risk systems, the AI Act imposes strict compliance obligations to eliminate bias and ensure equal treatment and non-discrimination. This is particularly relevant to AI-generated job descriptions, as they fall under the domain of recruitment and employment.

Furthermore, the Act requires that any automated decisions made by AI systems must be verified before initiating formal processes, especially those with significant impacts like termination. This ensures that AI-generated job descriptions and related decision-making processes adhere to legal standards, particularly regarding fairness and transparency.

Additionally, the AI Act places a central focus on preventing bias within AI systems. It broadens existing obligations to ensure that AI tools are free from any bias, which is crucial for AI-generated job descriptions that need to be fair and non-discriminatory.

While the Act does not explicitly address generative AI (GAI), the European Parliament's amendments attempt to address the issue of GAI by focusing on foundation models. This includes a separate risk category and specific obligations for providers of such models. Generative AI, which includes systems intended to generate content such as complex text (like job descriptions), will have to comply with transparency obligations and ensure safeguards against generating content in breach of Union law.

Given these regulations, using AI for generating job descriptions would require careful consideration to ensure compliance with the Act. This includes a focus on eliminating biases, ensuring transparency, and adhering to the specific requirements laid out for high-risk AI systems. As the Act is yet to be formally adopted and its final form may still be subject to changes, it will be important for employers and AI tool providers to stay informed and prepared to adapt to these new regulations.

Thus, we can see that misuse is guarded against both by OpenAI itself and by legislation of this kind. Even if, for instance, hybrid systems or a proprietary system are developed and applied in a single instance, they will still be regulated under the upcoming law in the near future.

Sources:

HR Optimisation: Navigating the EU AI Act: Implications for Recruiters and Employers
Deloitte: Generative AI Legal Issues
European Law Blog: The EU AI Act at a crossroads: generative AI as a challenge for regulation

Links

Open-source tokenizer (tiktoken, a well-working open-source tokenizer):

https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken

https://platform.openai.com/docs/quickstart?context=python   

https://spacy.io/api/large-language-models

https://github.com/sainivarsha97/spacy-Tutorial/blob/master/Text%20Summarization%20using%20spaCy.ipynb

https://www.linkedin.com/pulse/european-ai-act-here-thierry-breton-gcnre/
