This article shows how to integrate Gemini AI into a Python program that takes voice input.
- First, install the speech recognition library (the SpeechRecognition package, imported as speech_recognition). Microphone input additionally requires PyAudio.
- Next, capture the user's voice, convert it to text, and pass that text to Gemini AI.
- Connecting to Gemini requires a secret API key.
- Instructions for obtaining one are in the official Gemini developer documentation.
- Create an account and generate the API key. To generate a key, you must select a project.
- You can create the project in Google Cloud (a Firebase project, for example, also works).
- Once you select the project, the API key is generated.
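Because the API key is a secret, it is safer to keep it outside the source file. The following is a minimal sketch, assuming the key is stored in an environment variable; the variable name GEMINI_API_KEY and the helper load_api_key are hypothetical and not part of the Gemini SDK.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Both the variable name and this helper are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Gemini API key."
        )
    return key
```

The returned value can then be passed to genai.configure(api_key=...), so the key never appears in the code you publish.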
Source Code:
# PYTHON FOR VOICE
# 1. GET VOICE COMMAND FROM USER
import os

import speech_recognition as sr
import google.generativeai as genai
import pyttsx3

recognizer = sr.Recognizer()

# Never hard-code the API key in the source file. Here it is read from an
# environment variable; the name GEMINI_API_KEY is an assumption, any name works.
API_KEY = os.environ["GEMINI_API_KEY"]
genai.configure(api_key=API_KEY)

def capture_voice_input():
    """Record one phrase from the default microphone."""
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    return audio

def convert_voice_to_text(audio):
    """Transcribe the recorded audio; return an empty string on failure."""
    try:
        text = recognizer.recognize_google(audio)
        print("You said: " + text)
    except sr.UnknownValueError:
        text = ""
        print("Sorry, I didn't understand that.")
    except sr.RequestError as e:
        text = ""
        print("Error: {0}".format(e))
    return text
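
def process_voice_command(text):
    """Answer the command with Gemini; return True when the user says stop."""
    if "stop" in text.lower():
        return True  # tells main() to end the loop
    if not text.strip():
        return False  # nothing was recognized, so listen again
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(text)
    print(response.text)
    # Alternative: save the reply as an MP3 with gTTS
    # (requires "from gtts import gTTS").
    # tts = gTTS(text=response.text, lang="en")
    # tts.save("output.mp3")
    engine = pyttsx3.init()
    engine.say(response.text)
    engine.runAndWait()
    return False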

def main():
    end_program = False
    while not end_program:
        audio = capture_voice_input()
        text = convert_voice_to_text(audio)
        end_program = process_voice_command(text)


if __name__ == "__main__":
    main()
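
The listen, answer, speak loop can also be tried without a microphone or an API key. The sketch below is an illustration under stated assumptions, not the code from this article: run_assistant and its three callable parameters are hypothetical names standing in for speech recognition, the Gemini call, and text-to-speech. It shows the loop skipping empty input and ending once a command contains "stop".

```python
from typing import Callable, List


def run_assistant(
    listen: Callable[[], str],
    answer: Callable[[str], str],
    speak: Callable[[str], None],
) -> int:
    """Run the listen/answer/speak loop until a command contains 'stop'.

    Returns the number of prompts that were answered.
    """
    answered = 0
    while True:
        text = listen()
        if "stop" in text.lower():
            break
        if not text.strip():
            continue  # nothing recognized; listen again
        speak(answer(text))
        answered += 1
    return answered


# Stand-ins for the microphone, the model, and the speech engine.
commands = iter(["hello", "", "please stop"])
spoken: List[str] = []
count = run_assistant(
    listen=lambda: next(commands),
    answer=lambda prompt: "echo: " + prompt,
    speak=spoken.append,
)
```

Passing the three steps in as functions keeps the loop logic separate from the hardware and network calls, which makes the stop condition easy to check in isolation.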
Please feel free to leave a comment if you have any questions.
Thank you.