{"id":11649,"date":"2025-03-12T11:07:49","date_gmt":"2025-03-12T04:07:49","guid":{"rendered":"https:\/\/tenten.vn\/ai\/?p=11649"},"modified":"2025-03-12T11:07:49","modified_gmt":"2025-03-12T04:07:49","slug":"cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code","status":"publish","type":"post","link":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/","title":{"rendered":"How to scrape website data in no time without knowing how to code"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Anyone who collects data from the web has probably run into this situation: no API, no database access, and a website protected by a &#8220;wall&#8221; of defenses such as stubborn reCAPTCHA challenges, geo-blocking, request rate limits, and an HTML structure designed to work against you. In this article, <\/span><a href=\"https:\/\/tenten.vn\/vi\"><span style=\"font-weight: 400;\">Tenten.vn<\/span><\/a><span style=\"font-weight: 400;\"> shows you how to get past these seemingly insurmountable obstacles. You do not need to know several programming languages to handle them. 
Let's dive in!<\/span><\/p>\n<h2><b>Getting past reCAPTCHA and bot detection<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Many modern websites use <\/span><a href=\"https:\/\/tenten.vn\/tin-tuc\/recaptcha-la-gi\/\"><span style=\"font-weight: 400;\">reCAPTCHA<\/span><\/a><span style=\"font-weight: 400;\"> to keep automated bots out. It does more than run a bot check; it also analyzes your behavior closely:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Mouse movement: if the pointer travels in a straight line or too evenly, the system suspects a bot.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Spoofed User Agent: emulated browsers often send an invalid User Agent, which is easy to detect.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Interaction speed: if you send requests too quickly, or too many in a short time, the system treats that as a sign of a bot.<\/span><\/li>\n<\/ul>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter wp-image-11650 size-full\" src=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-1.jpg\" alt=\"Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-1\" width=\"600\" height=\"338\" 
srcset=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-1.jpg 600w, https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-1-300x169.jpg 300w, https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-1-390x220.jpg 390w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>To get past bot-blocking tools, we have<b> Puppeteer<\/b><span>. It lets you simulate the behavior of a real user, from moving the mouse randomly and setting a valid User Agent to solving reCAPTCHA by hand.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Below is a simple JavaScript snippet that uses Puppeteer to open a page with a realistic User Agent and a randomized mouse movement. Because the browser window stays visible, you can solve any reCAPTCHA challenge yourself:<\/span><\/p>\n<p><code><span style=\"font-weight: 400;\">const puppeteer = require('puppeteer');<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">async function bypassCaptcha() {<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\/\/ Launch the browser in visible (non-headless) mode so you can watch it<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0const browser = await puppeteer.launch({ headless: false });<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0const page = await browser.newPage();<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\/\/ Set a valid User Agent to emulate a real browser<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0await page.setUserAgent('Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/91.0.4472.124 Safari\/537.36');<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\/\/ Open the target website<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0await page.goto('https:\/\/example.com');<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\/\/ Move the mouse to a random position on the loaded page, as a human visitor might<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0await page.mouse.move(100 + Math.random() * 500, 200 + Math.random() * 300);<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\/\/ Close the browser when finished<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0await browser.close();<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">}<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\/\/ Run the function and report any error<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">bypassCaptcha().catch(console.error);<\/span><\/code><\/p>\n
<p><b>How to use the snippet:<\/b><span style=\"font-weight: 400;\"> Open any text editor (such as Visual Studio Code, Sublime Text, or Notepad++), paste the code, and save it under any name, for example <em>index.js<\/em>. 
After saving <em>index.js<\/em>, go back to Terminal or Command Prompt and run:<\/span><\/p>\n<p><code><span style=\"font-weight: 400;\">node index.js<\/span><\/code><\/p>\n<p><span style=\"font-weight: 400;\">If everything is set up correctly, Puppeteer opens a browser window and performs actions such as random mouse movement and visiting the website you specified.<\/span><\/p>\n<h2><b>Working around geo-blocking and rate limits<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Sometimes you cannot reach a website at all, or you get blocked after a few page loads. That happens because the site blocks IP addresses from specific regions or limits the number of requests (for example, only 5 visits every 10 minutes).<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-11651 size-full\" src=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-2.jpg\" alt=\"Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-2\" width=\"600\" height=\"350\" srcset=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-2.jpg 600w, https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-2-300x175.jpg 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To get around geographic IP blocking and rate limits, you can use <\/span><a href=\"https:\/\/smartproxy.com\/blog\/why-rotating-proxies-are-the-best#:~:text=A%20rotating%20proxy%20is%20a%20proxy%20server%20that%20automatically%20switches,a%20different%20device%20or%20location.\"><b>rotating residential proxies<\/b><\/a><span style=\"font-weight: 400;\">. These proxies make your traffic appear to come from real users' IP addresses in specific regions. By changing IP every few requests, you can avoid being detected and blocked.<\/span><\/p>\n
<p><span style=\"font-weight: 400;\">Below is a simple Python snippet that uses <\/span><span style=\"font-weight: 400;\">rotating residential proxies<\/span><span style=\"font-weight: 400;\"> to work around this problem:<\/span><\/p>\n<p><code><span style=\"font-weight: 400;\">import requests<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">from time import sleep<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\"># <\/span><span style=\"font-weight: 400;\">rotating residential proxies<\/span><span style=\"font-weight: 400;\"> (replace with your own proxies)<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">proxy_list = [<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\"http:\/\/user:pass@br.proxy.example.com:8080\",<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\"http:\/\/user:pass@us.proxy.example.com:8080\",<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0# Add more proxies here...<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">]<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">def scrape_safe(url):<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0for proxy in proxy_list:<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0try:<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0# Send the request through the current proxy (give up after 15 seconds)<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0response = requests.get(url, proxies={\"http\": proxy, \"https\": proxy}, timeout=15)<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0if response.status_code == 200:<\/span><\/code><\/p>\n
<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0return response.text\u00a0 # Return the page content on success<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0sleep(10)\u00a0 # Wait 10 seconds before trying the next proxy<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0except requests.RequestException as e:<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0print(f\"Error with proxy {proxy}: {e}\")<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0continue\u00a0 # Move on to the next proxy if an error occurs<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0return None\u00a0 # Return None if every proxy failed<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">if __name__ == \"__main__\":<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0print(scrape_safe(\"https:\/\/example.com\"))\u00a0 # Print the page content, or None<\/span><\/code><\/p>\n<p><b>How to use the snippet:<\/b><span style=\"font-weight: 400;\"> Open any text editor (such as Visual Studio Code, Sublime Text, or Notepad++), paste the code, and save it under any name, <\/span><i><span style=\"font-weight: 400;\">for example scrape.py<\/span><\/i><span style=\"font-weight: 400;\">. 
Go back to Terminal or Command Prompt, make sure you are in the folder that contains <em>scrape.py<\/em>, and run:<\/span><\/p>\n<p><code>python scrape.py<\/code><\/p>\n<h2><b>Handling unstructured HTML<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Many modern websites are built with frameworks such as React, Angular, or Vue.js, so their HTML structure can change at any time and make scraping harder. For example, a product list that lives in <\/span><code><span style=\"font-weight: 400;\">&lt;div class=\"product-list\"&gt;<\/span><\/code><span style=\"font-weight: 400;\"> today may move to <\/span><code><span style=\"font-weight: 400;\">&lt;section id=\"products\"&gt;<\/span><\/code><span style=\"font-weight: 400;\"> tomorrow.<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-11652 size-full\" src=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-3.jpg\" alt=\"Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-3\" width=\"600\" height=\"300\" srcset=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-3.jpg 600w, https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-3-300x150.jpg 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To cope with this, we can use <\/span><a href=\"https:\/\/github.com\/unclecode\/crawl4ai\"><b>Crawl4AI<\/b><\/a><span style=\"font-weight: 400;\">, a powerful tool 
that combines data collection with artificial intelligence, together with the <\/span><b>DeepSeek<\/b><span style=\"font-weight: 400;\"> language model. Instead of depending on a fixed HTML structure, this approach focuses on the meaning of the data, which lets you extract information flexibly.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Below is a simple Python snippet that uses Crawl4AI and DeepSeek to handle unstructured HTML. Treat it as a sketch, because the exact class and argument names depend on the Crawl4AI version you install:<\/span><\/p>\n<p><code><span style=\"font-weight: 400;\">from crawl4ai import WebCrawler<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">from llama_cpp import Llama<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\"># Initialize the DeepSeek language model<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">llm = Llama(model_path=\"deepseek-1.3b.gguf\")<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\"># Initialize the WebCrawler<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">crawler = WebCrawler()<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\"># Run the crawl with an LLM-based strategy<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">content = crawler.run(<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0url=\"https:\/\/example.com\",<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0strategy=\"llm\",\u00a0 # Use the LLM-based strategy<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 
400;\">\u00a0\u00a0\u00a0\u00a0llm=llm,<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0prompt=\"Extract product names, prices into JSON\"<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">)<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\"># Print the extracted data<\/span><\/code><\/p>\n<p><code><span style=\"font-weight: 400;\">print(content.extracted_data)<\/span><\/code><\/p>\n<p><b>How to use the snippet:<\/b><span style=\"font-weight: 400;\"> Open any text editor (such as Visual Studio Code, Sublime Text, or Notepad++), paste the code, and save it under any name, <\/span><i><span style=\"font-weight: 400;\">for example crawler.py<\/span><\/i><span style=\"font-weight: 400;\">. Go back to Terminal or Command Prompt, make sure you are in the folder that contains <em>crawler.py<\/em>, and run:<\/span><\/p>\n<p><code><span style=\"font-weight: 400;\">python crawler.py<\/span><\/code><\/p>\n<h2><b>Conclusion<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">In this article, we explored how to get past common barriers when collecting data from websites. Remember that scraping must comply with each website's rules and policies. 
Always make sure you use the data legally and ethically, and avoid harming the site's systems or other users' experience.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you run into any difficulty along the way, leave us a message and we will help!<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Anyone who collects data from the web has probably run into this situation: no API, no database access, and a website protected by a &#8220;wall&#8221; of defenses such as stubborn reCAPTCHA challenges, geo-blocking, request rate limits, and an HTML structure &hellip;<\/p>\n","protected":false},"author":41,"featured_media":11653,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[92,151,152,135],"class_list":["post-11649","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-chatgpt","tag-ai","tag-lay-du-lieu","tag-proxy","tag-scraping-website"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v24.6 (Yoast SEO v27.4) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, 
max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/\" \/>\n<meta property=\"og:locale\" content=\"vi_VN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code\" \/>\n<meta property=\"og:description\" content=\"Ch\u1eafc h\u1eb3n nhi\u1ec1u b\u1ea1n l\u1ea5y d\u1eef li\u1ec7u \u0111\u1ec1u g\u1eb7p ph\u1ea3i tr\u01b0\u1eddng h\u1ee3p ph\u1ed5 bi\u1ebfn n\u00e0y. Kh\u00f4ng API, kh\u00f4ng th\u1ec3 truy c\u1eadp c\u01a1 s\u1edf d\u1eef li\u1ec7u, v\u00e0 m\u1ed9t trang web v\u1edbi &#8220;b\u1ee9c t\u01b0\u1eddng&#8221; b\u1ea3o v\u1ec7 nh\u01b0 l\u00e0: reCAPTCHA kh\u00f3 nh\u1eb1n, ng\u0103n ch\u1eb7n \u0111\u1ecba l\u00fd, gi\u1edbi h\u1ea1n y\u00eau c\u1ea7u theo th\u1eddi gian, v\u00e0 c\u1ea5u tr\u00fac HTML \u0111\u01b0\u1ee3c &hellip;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/\" \/>\n<meta property=\"og:site_name\" content=\"Tenten AI\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/tentenvn.gmo\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-03-12T04:07:49+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Qu\u1ea3n tr\u1ecb vi\u00ean\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u0110\u01b0\u1ee3c vi\u1ebft b\u1edfi\" 
\/>\n\t<meta name=\"twitter:data1\" content=\"Qu\u1ea3n tr\u1ecb vi\u00ean\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u01af\u1edbc t\u00ednh th\u1eddi gian \u0111\u1ecdc\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 ph\u00fat\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/\"},\"author\":{\"name\":\"Qu\u1ea3n tr\u1ecb vi\u00ean\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#\\\/schema\\\/person\\\/cc27c88f21892a35616b0ac3784f537a\"},\"headline\":\"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code\",\"datePublished\":\"2025-03-12T04:07:49+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/\"},\"wordCount\":1333,\"publisher\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg\",\"keywords\":[\"ai\",\"l\u1ea5y d\u1eef li\u1ec7u\",\"Proxy\",\"scraping website\"],\"articleSection\":[\"ChatGPT\"],\"inLanguage\":\"vi\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/\",\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/\",\"name\":\"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 
n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg\",\"datePublished\":\"2025-03-12T04:07:49+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#breadcrumb\"},\"inLanguage\":\"vi\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"vi\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#primaryimage\",\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg\",\"contentUrl\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg\",\"width\":1200,\"height\":630,\"caption\":\"ach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Trang ch\u1ee7\",\"item\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"C\u00e1ch l\u1ea5y d\u1eef 
li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#website\",\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/\",\"name\":\"Tenten AI\",\"description\":\"Chia s\u1ebb ki\u1ebfn th\u1ee9c v\u1ec1 AI\",\"publisher\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"vi\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#organization\",\"name\":\"Tenten AI\",\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"vi\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2023\\\/04\\\/autogpt-tenten-4.jpg\",\"contentUrl\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/wp-content\\\/uploads\\\/2023\\\/04\\\/autogpt-tenten-4.jpg\",\"width\":600,\"height\":498,\"caption\":\"Tenten AI\"},\"image\":{\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/tentenvn.gmo\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/#\\\/schema\\\/person\\\/cc27c88f21892a35616b0ac3784f537a\",\"name\":\"Qu\u1ea3n tr\u1ecb 
vi\u00ean\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"vi\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g\",\"caption\":\"Qu\u1ea3n tr\u1ecb vi\u00ean\"},\"sameAs\":[\"https:\\\/\\\/tenten.vn\\\/ai\\\/\"],\"url\":\"https:\\\/\\\/tenten.vn\\\/ai\\\/author\\\/duongnt4\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/","og_locale":"vi_VN","og_type":"article","og_title":"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code","og_description":"Ch\u1eafc h\u1eb3n nhi\u1ec1u b\u1ea1n l\u1ea5y d\u1eef li\u1ec7u \u0111\u1ec1u g\u1eb7p ph\u1ea3i tr\u01b0\u1eddng h\u1ee3p ph\u1ed5 bi\u1ebfn n\u00e0y. 
Kh\u00f4ng API, kh\u00f4ng th\u1ec3 truy c\u1eadp c\u01a1 s\u1edf d\u1eef li\u1ec7u, v\u00e0 m\u1ed9t trang web v\u1edbi &#8220;b\u1ee9c t\u01b0\u1eddng&#8221; b\u1ea3o v\u1ec7 nh\u01b0 l\u00e0: reCAPTCHA kh\u00f3 nh\u1eb1n, ng\u0103n ch\u1eb7n \u0111\u1ecba l\u00fd, gi\u1edbi h\u1ea1n y\u00eau c\u1ea7u theo th\u1eddi gian, v\u00e0 c\u1ea5u tr\u00fac HTML \u0111\u01b0\u1ee3c &hellip;","og_url":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/","og_site_name":"Tenten AI","article_publisher":"https:\/\/www.facebook.com\/tentenvn.gmo\/","article_published_time":"2025-03-12T04:07:49+00:00","og_image":[{"width":1200,"height":630,"url":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg","type":"image\/jpeg"}],"author":"Qu\u1ea3n tr\u1ecb vi\u00ean","twitter_card":"summary_large_image","twitter_misc":{"\u0110\u01b0\u1ee3c vi\u1ebft b\u1edfi":"Qu\u1ea3n tr\u1ecb vi\u00ean","\u01af\u1edbc t\u00ednh th\u1eddi gian \u0111\u1ecdc":"6 ph\u00fat"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#article","isPartOf":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/"},"author":{"name":"Qu\u1ea3n tr\u1ecb vi\u00ean","@id":"https:\/\/tenten.vn\/ai\/#\/schema\/person\/cc27c88f21892a35616b0ac3784f537a"},"headline":"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft 
code","datePublished":"2025-03-12T04:07:49+00:00","mainEntityOfPage":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/"},"wordCount":1333,"publisher":{"@id":"https:\/\/tenten.vn\/ai\/#organization"},"image":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#primaryimage"},"thumbnailUrl":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg","keywords":["ai","l\u1ea5y d\u1eef li\u1ec7u","Proxy","scraping website"],"articleSection":["ChatGPT"],"inLanguage":"vi"},{"@type":"WebPage","@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/","url":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/","name":"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code","isPartOf":{"@id":"https:\/\/tenten.vn\/ai\/#website"},"primaryImageOfPage":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#primaryimage"},"image":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#primaryimage"},"thumbnailUrl":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg","datePublished":"2025-03-12T04:07:49+00:00","breadcrumb":{"@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#breadcrumb"},"inLanguage":"vi","potentialAction":[{"@type":"ReadAction","target":["https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/"]}]},{"@type":"ImageObject","inLanguage":"vi","@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#primaryimage","url":"https:\/\/tenten.vn\/ai
\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg","contentUrl":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2025\/03\/Cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail.jpg","width":1200,"height":630,"caption":"ach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code-thumbnail"},{"@type":"BreadcrumbList","@id":"https:\/\/tenten.vn\/ai\/cach-lay-du-lieu-trang-web-trong-1-not-nhac-ma-khong-can-biet-code\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Trang ch\u1ee7","item":"https:\/\/tenten.vn\/ai\/"},{"@type":"ListItem","position":2,"name":"C\u00e1ch l\u1ea5y d\u1eef li\u1ec7u trang web trong 1 n\u1ed1t nh\u1ea1c m\u00e0 kh\u00f4ng c\u1ea7n bi\u1ebft code"}]},{"@type":"WebSite","@id":"https:\/\/tenten.vn\/ai\/#website","url":"https:\/\/tenten.vn\/ai\/","name":"Tenten AI","description":"Chia s\u1ebb ki\u1ebfn th\u1ee9c v\u1ec1 AI","publisher":{"@id":"https:\/\/tenten.vn\/ai\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/tenten.vn\/ai\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"vi"},{"@type":"Organization","@id":"https:\/\/tenten.vn\/ai\/#organization","name":"Tenten AI","url":"https:\/\/tenten.vn\/ai\/","logo":{"@type":"ImageObject","inLanguage":"vi","@id":"https:\/\/tenten.vn\/ai\/#\/schema\/logo\/image\/","url":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2023\/04\/autogpt-tenten-4.jpg","contentUrl":"https:\/\/tenten.vn\/ai\/wp-content\/uploads\/2023\/04\/autogpt-tenten-4.jpg","width":600,"height":498,"caption":"Tenten AI"},"image":{"@id":"https:\/\/tenten.vn\/ai\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/tentenvn.gmo\/"]},{"@type":"Person","@id":"https:\/\/tenten.vn\/ai\/#\/schema\/person\/cc27c88f21892a35616b0ac3784f537a","name":"Qu\u1ea3n 
tr\u1ecb vi\u00ean","image":{"@type":"ImageObject","inLanguage":"vi","@id":"https:\/\/secure.gravatar.com\/avatar\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/aee149e9f11ff0626395dbcd263af1ded26baae4fc0e306cf4619410bd738ad6?s=96&d=mm&r=g","caption":"Qu\u1ea3n tr\u1ecb vi\u00ean"},"sameAs":["https:\/\/tenten.vn\/ai\/"],"url":"https:\/\/tenten.vn\/ai\/author\/duongnt4\/"}]}},"_links":{"self":[{"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/posts\/11649","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/comments?post=11649"}],"version-history":[{"count":2,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/posts\/11649\/revisions"}],"predecessor-version":[{"id":11655,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/posts\/11649\/revisions\/11655"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/media\/11653"}],"wp:attachment":[{"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/media?parent=11649"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/categories?post=11649"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tenten.vn\/ai\/wp-json\/wp\/v2\/tags?post=11649"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}