Last edited by hardrock on 2013-11-22 14:34

The robots.txt file must be placed in the root directory of the website. The most basic check is to visit your domain followed directly by /robots.txt; if the file loads, it is in the right place.

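If you would rather script that check than open a browser, here is a minimal Python sketch. The helper names `robots_url` and `robots_reachable` are made up for this post, and www.xxx.com is just the placeholder domain used in this thread. It assumes the site URL includes the scheme (http:// or https://).

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Build the root-level robots.txt URL from any page URL on the site."""
    parts = urlsplit(site)
    # Keep only scheme and host; robots.txt always lives at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_reachable(site: str, timeout: float = 10.0) -> bool:
    """Return True if the root-level robots.txt answers with HTTP 200."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Covers 404 and other HTTP errors, DNS failures, and timeouts.
        return False
```

Calling `robots_reachable("http://www.xxx.com/")` with your real domain should tell you the same thing as the manual check, as far as I can tell.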
I found this piece of code:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/cache/
Disallow: /wp-content/languages/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/upgrade/
Disallow: /wp-includes/
Disallow: /comments/
Disallow: /category/
Disallow: /tag/
Disallow: /page/
Disallow: /feed/
Disallow: /author/
Disallow: /trackback/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
Disallow: /2013/
Disallow: /*/feed/
Disallow: /*/trackback/
Disallow: /*?
Disallow: /*/*?
Disallow: /*/*/*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Google AdSense
User-agent: Mediapartners-Google
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /

Sitemap: http://www.xxx.com/sitemap.xml
Sitemap: http://www.xxx.com/sitemap_baidu.xml
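
The problem is that this code is meant for a Chinese-language site targeting Baidu, whereas I run an English-language site that needs to work with Google. How should the code above be changed to suit an English site?
I know nothing at all about code...

My main question is about lines 31 to 47 of the code (the per-crawler sections, from "# Google Image" down to the ia_archiver block). Since mine is an English site, these lines should be set to allow, right? Is blocking crawling only something Chinese sites do?
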
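On the question about those per-crawler sections: each block is selected by the crawler's User-agent name, not by the language of the site. Here is a rough sketch using Python's standard `urllib.robotparser` on a cut-down version of the file. The `SAMPLE` text and the helpers `build_parser` and `allowed` are only illustrations for this post. Note that Python's parser does plain prefix matching and does not understand the `*` and `$` wildcards, so the sample sticks to plain paths.

```python
from urllib.robotparser import RobotFileParser

# Cut-down sample: one general group plus two crawler-specific groups.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/

User-agent: Googlebot-Image
Disallow:
Allow: /

User-agent: ia_archiver
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Ask whether the named crawler may fetch the path on the placeholder domain."""
    return parser.can_fetch(agent, "http://www.xxx.com" + path)


parser = build_parser(SAMPLE)
```

In this sketch a crawler named in its own group follows only that group: Googlebot-Image is allowed everywhere, ia_archiver is blocked everywhere, and any other crawler falls back to the `User-agent: *` group.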
Addendum (2013-12-22 17:43):
It doesn't need to be this complicated; the following is enough.
Sitemap: hxxp://www.xxx.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-*

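For anyone wondering what a rule like `/wp-*` or `/*.php$` actually covers, here is a small sketch of Google-style pattern matching, where `*` stands for any run of characters and a trailing `$` anchors the end of the URL. `rule_matches` is a hypothetical helper written for this post, not part of any library, and it only approximates how real crawlers behave.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the URL path.

    `*` matches any run of characters; a trailing `$` anchors the end.
    Without `$`, a pattern matches any path that starts with it.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" in place of each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path only.
    return re.match(regex, path) is not None
```

Because rules are prefix matches anyway, `/wp-*` behaves the same as `/wp-` here. It also covers everything under /wp-content/, including uploaded images, which is worth keeping in mind.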
Addendum (2013-12-27 17:17):
http://blog.csdn.net/wallacer/article/details/654289