Day 9

TIL - Selenium

📋 What I studied

Selenium?

Selenium

An automation framework for driving a web browser

WebDriver

The interface that connects to a web browser and controls it

Installation

pip3 install selenium
pip3 install webdriver-manager

Using Selenium

import

  • Importing Selenium and the webdriver
# Connects directly to the web browser
from selenium import webdriver
# Passed as an argument when creating the Chrome object
from selenium.webdriver.chrome.service import Service
# Keeps the driver in sync with the version of Chrome in use
# e.g. for Firefox, use the Firefox driver manager
from webdriver_manager.chrome import ChromeDriverManager

Creating the driver, request & response

  • Create a Chrome object to launch a Chrome window
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
  • Send a request with .get(url)
url = "https://www.example.com"
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(url)
  • Inspect the response HTML document with .page_source
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(url)
print(driver.page_source)
  • Close the driver automatically with a with-as statement
    • Once the code in the block has run, the driver quits and the Chrome window closes
with webdriver.Chrome(service=Service(ChromeDriverManager().install())) as driver:
    driver.get(url)
    print(driver.page_source)
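
To see why the with-as form closes the browser, here is a minimal sketch using a stand-in class. `FakeDriver` and `run_and_fail` are hypothetical names made up for this note. They only mimic the enter/exit behavior and are not Selenium's implementation. The point is that the exit hook runs even when the body raises.

```python
class FakeDriver:
    """Hypothetical stand-in for a browser driver (not part of Selenium)."""

    def __init__(self):
        self.closed = False

    def quit(self):
        # A real driver would shut down the browser process here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the with-block ends, normally or through an exception.
        self.quit()
        return False  # do not swallow the exception


def run_and_fail():
    """Raise inside the with-block and report whether quit() still ran."""
    driver = FakeDriver()
    try:
        with driver:
            raise RuntimeError("scraping failed")
    except RuntimeError:
        pass
    return driver.closed
```

Without with-as, the same guarantee would need an explicit try/finally around the scraping code.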

Extracting HTML elements

  • Extracting specific HTML elements
    • .find_element(by, target) : returns the first matching element
    • .find_elements(by, target) : returns a list of all matching elements
    • by : By.ID, By.TAG_NAME, By.CLASS_NAME, By.XPATH …
from selenium.webdriver.common.by import By
  • Example 1
with webdriver.Chrome(service=Service(ChromeDriverManager().install())) as driver:
    driver.get(url)
    p1 = driver.find_element(By.TAG_NAME, "p")
    p_list = driver.find_elements(By.TAG_NAME, "p")
    print(p1.text)
    for p in p_list:
        print(p.text)
  • Example 2

    • Locating a single element by its XPath
    
    path = '//*[@id="__next"]/div/main/div[2]/div/div[4]/div[1]/div[1]/div/a/div[2]/p[1]'
    result = driver.find_element(By.XPATH, path)
    print(result.text)
    
    • Fetching several elements whose XPaths differ only in one index

    # Keep the repeated part as-is, write the varying part as {}, then fill it in with .format()
    path = '//*[@id="__next"]/div/main/div[2]/div/div[4]/div[1]/div[{}]/div/a/div[2]/p[1]'
    for i in range(1, 11):
        result = driver.find_element(By.XPATH, path.format(i))
        print(result.text)
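
The templating step can be tried on its own, without a browser. The sketch below only builds the XPath strings. `build_xpaths` is a hypothetical helper name, and the template is the one from this post.

```python
def build_xpaths(template, start, stop):
    """Fill the single {} slot in `template` with each index in [start, stop)."""
    return [template.format(i) for i in range(start, stop)]


template = '//*[@id="__next"]/div/main/div[2]/div/div[4]/div[1]/div[{}]/div/a/div[2]/p[1]'
paths = build_xpaths(template, 1, 11)

print(len(paths))   # number of generated XPaths
print(paths[0])     # the XPath with index 1 filled in
```

Each string in `paths` could then be passed to `driver.find_element(By.XPATH, ...)`. Note that XPath indices start at 1, which is why the range starts at 1.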
    

Wait and Call

Dynamic web pages fill in content after the initial response, so the scraper has to wait for a while before looking for elements (Wait)

  • Implicit Wait

    • .implicitly_wait({num}) : element lookups wait up to num seconds for the element to appear, and move on as soon as it is found

    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    driver.implicitly_wait(10)  # 10 seconds
    result = driver.find_element(By.XPATH, path)
    
  • Explicit Wait

    • WebDriverWait()
      • until() : wait until the condition is satisfied
      • until_not() : wait until the condition is no longer satisfied
      • expected_conditions (EC) : ready-made conditions defined in Selenium

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    with webdriver.Chrome(service=Service(ChromeDriverManager().install())) as driver:
        driver.get(url)
        # Wait up to 10 seconds until the element whose XPath is path is present
        element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, path)))
        print(element.text)
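
As a mental model for until(), an explicit wait is a polling loop: check a condition, sleep briefly, and give up after a timeout. The sketch below is a simplified pure-Python version of that idea and not Selenium's code. `wait_until`, `poll_interval`, and `ready_on_third_call` are names made up for this note.

```python
import time


def wait_until(condition, timeout, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Dry run: a fake condition that only succeeds on its third call.
calls = {"n": 0}


def ready_on_third_call():
    calls["n"] += 1
    return "loaded" if calls["n"] >= 3 else None


value = wait_until(ready_on_third_call, timeout=2, poll_interval=0.01)
print(value, calls["n"])
```

The key difference from time.sleep() is that the loop returns as soon as the condition holds, so the full timeout is only spent when the element never shows up.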
    

Events

  • Mouse events

    • Many events are possible: moving the mouse, pressing a button, releasing it, and so on
    • Code that finds a specific button and clicks it

    # Other imports omitted
    from selenium.webdriver import ActionChains

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get(url)
    driver.implicitly_wait(0.5)
    # When an element has two or more classes, join them with . to match all of them at once
    # Find the button by class name
    button = driver.find_element(By.CLASS_NAME, 'UtilMenustyle__Link-sc-2sjysx-4.ewJwEL')
    # Perform a mouse `click` on the button that was found
    ActionChains(driver).click(button).perform()
    
  • Keyboard events

    • Many events are possible: pressing a key, releasing it, and so on

    • Code that finds a specific input field and types into it

    # Other imports omitted
    from selenium.webdriver import ActionChains, Keys

    # Find the input tag for the ID via XPath
    id_input = driver.find_element(By.XPATH, id_path)
    # Type your_id (a placeholder string holding your login ID) into the input
    ActionChains(driver).send_keys_to_element(id_input, your_id).perform()
    
  • Login automation example

    • Code that goes to the login page, enters the ID and password, and presses the login button

    import time

    # url, id_path, pw_path, and so on differ from site to site
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get(url)
    time.sleep(1)
    
    driver.implicitly_wait(0.5)
    # class_name_1, class_name_2, your_id, and your_password are placeholder strings to fill in
    button = driver.find_element(By.CLASS_NAME, class_name_1)
    ActionChains(driver).click(button).perform()
    time.sleep(1)
    
    id_input = driver.find_element(By.XPATH, id_path)
    ActionChains(driver).send_keys_to_element(id_input, your_id).perform()
    time.sleep(1)
    
    pw_input = driver.find_element(By.XPATH, pw_path)
    ActionChains(driver).send_keys_to_element(pw_input, your_password).perform()
    time.sleep(1)
    
    login_button = driver.find_element(By.CLASS_NAME, class_name_2)
    ActionChains(driver).click(login_button).perform()
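
The four login steps can also be written as one function and dry-run without a browser. This is a sketch under several assumptions. It calls click() and send_keys() on the element directly instead of going through ActionChains. Locators are passed as (by, value) tuples, where "xpath" and "class name" are the string values behind By.XPATH and By.CLASS_NAME. `login`, `RecordingDriver`, `RecordingElement`, and every locator value below are made up for illustration.

```python
def login(driver, open_button, id_field, pw_field, submit_button, user_id, password):
    """Run the login steps on any object exposing find_element(by, value).

    Each locator argument is a (by, value) tuple, e.g. ("xpath", "//input").
    """
    driver.find_element(*open_button).click()
    driver.find_element(*id_field).send_keys(user_id)
    driver.find_element(*pw_field).send_keys(password)
    driver.find_element(*submit_button).click()


class RecordingElement:
    """Hypothetical element that logs actions instead of touching a page."""

    def __init__(self, log, name):
        self.log = log
        self.name = name

    def click(self):
        self.log.append(("click", self.name))

    def send_keys(self, text):
        self.log.append(("type", self.name, text))


class RecordingDriver:
    """Hypothetical driver stand-in for a dry run without a browser."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return RecordingElement(self.log, value)


dry = RecordingDriver()
login(
    dry,
    open_button=("class name", "open-login"),
    id_field=("xpath", "//input[@id='id']"),
    pw_field=("xpath", "//input[@id='pw']"),
    submit_button=("class name", "submit"),
    user_id="my_id",
    password="my_password",
)
for entry in dry.log:
    print(entry)
```

With a real driver, each step would still need a wait (implicit or explicit) before the lookup, since the sketch drops the time.sleep(1) calls.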
    

👀 CHECK

(Things to revisit: anything difficult or newly learned)

  • selenium

    • Look up mouse events and keyboard events
    • Find out what other conditions EC (expected conditions) offers and write them up
  • Finding an element that has several classes

    • Join the class names with . as in class_name_1.class_name_2 to match all of them at once
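
The dot-joined form is a CSS class selector in disguise, so it can be built from the class attribute as it appears in the HTML. A small sketch follows. `classes_to_css` is a hypothetical helper, and it assumes the class names contain no characters that need CSS escaping.

```python
def classes_to_css(class_attr):
    """Turn a space-separated class attribute into a CSS selector like '.a.b'."""
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    return "".join("." + name for name in names)


selector = classes_to_css("UtilMenustyle__Link-sc-2sjysx-4 ewJwEL")
print(selector)  # .UtilMenustyle__Link-sc-2sjysx-4.ewJwEL
```

The result can be passed to `driver.find_element(By.CSS_SELECTOR, selector)`, which states the intent more explicitly than relying on By.CLASS_NAME to accept a compound name.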

❗ Reflections

Today again I finished both the lectures and the exercises in about two and a half hours. Since I keep finishing early and only easy material comes up, it feels like I'm only skimming the surface, and I felt a little anxious about whether I should be studying more on my own. So today, after writing this TIL, I'm going to use what I learned to try extracting stats from the KBO site. Once that's done, I plan to write it up separately and post it.

Also, since setting up this blog yesterday, I've been wondering what to do with the velog where I used to post my TILs. Deleting everything would be a waste, and if I stop posting there it will look like an abandoned blog.. I'm also considering uploading a week's worth at a time on weekends as a backup!

It's still fairly easy overall, but it's getting more and more interesting. I'm looking forward to the upcoming material and to the project we'll do early next month. 🤗
