Web Crawler: Setting Up the Java Project

    xiaoxiao  2025-06-05

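    Creating the web crawler project

    1. Create a Maven project

    Just follow the screenshots. The coordinates used in this project are groupId com.crawlerTest and artifactId CrawlerTest.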

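    2. Adjust the directory layout:

    Add the files the project needs, such as the Test class in the com.crawlerTest package used in step 4.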

    3. Update the configuration

    Edit pom.xml. The main dependency used here is httpclient; slf4j-log4j12 is included for logging.

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>

        <groupId>com.crawlerTest</groupId>
        <artifactId>CrawlerTest</artifactId>
        <version>1.0-SNAPSHOT</version>

        <dependencies>
            <!-- https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient -->
            <dependency>
                <groupId>org.apache.httpcomponents</groupId>
                <artifactId>httpclient</artifactId>
                <version>4.5.2</version>
            </dependency>
            <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12 -->
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
                <version>1.7.25</version>
                <!-- <scope>test</scope> -->
            </dependency>
        </dependencies>
    </project>

    Wait for Maven to finish downloading the dependencies.
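
    If it is unclear whether the download succeeded, one quick check is to ask the JVM whether a class from each dependency can be loaded. The following is a minimal sketch, not part of the original setup: DependencyCheck and isOnClasspath are hypothetical names, org.apache.http.impl.client.HttpClients comes from httpclient, and org.slf4j.LoggerFactory is assumed to arrive through slf4j-api, which slf4j-log4j12 pulls in transitively.

```java
// Hypothetical sanity check, not part of the original post.
class DependencyCheck {

    // Returns true when the named class can be found on the current classpath.
    // The class is located but not initialized, so no static code runs.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One representative class per dependency declared in pom.xml (assumed names, see above).
        String[] names = {
            "org.apache.http.impl.client.HttpClients",
            "org.slf4j.LoggerFactory"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

    Run it from inside the project so that the Maven classpath is used; run anywhere else, it will simply report both classes as missing.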

    4. Add test code to the Test class

    package com.crawlerTest;

    import org.apache.http.HttpEntity;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;

    import java.io.IOException;

    public class Test {
        public static void main(String[] args) throws IOException {
            // 1. "Open the browser": create the HttpClient object
            CloseableHttpClient httpClient = HttpClients.createDefault();
            try {
                // 2. "Type the URL": create the HttpGet object for the GET request
                HttpGet httpGet = new HttpGet("http://www.itcast.cn");
                // 3. "Press Enter": send the request with httpClient and receive the response
                CloseableHttpResponse response = httpClient.execute(httpGet);
                try {
                    // 4. Parse the response: read the body only when the status code is 200
                    if (response.getStatusLine().getStatusCode() == 200) {
                        HttpEntity httpEntity = response.getEntity();
                        String content = EntityUtils.toString(httpEntity, "utf-8");
                        System.out.println(content);
                    }
                } finally {
                    // Release the connection held by the response
                    response.close();
                }
            } finally {
                // Shut down the client and its connection pool
                httpClient.close();
            }
        }
    }
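
    Printing the whole page makes it hard to tell at a glance whether the fetch worked. As a possible follow-up, the content string could be reduced to its title before printing. The sketch below is an assumption and not part of the original project: PageSummary and extractTitle are hypothetical names, the HTML string stands in for the real response body, and the title is found with a simple regular-expression scan rather than a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post: summarizes fetched HTML.
class PageSummary {

    // Matches the first <title>...</title> pair, ignoring case and spanning line breaks.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed title text, or null when the page has no <title> element.
    static String extractTitle(String html) {
        Matcher matcher = TITLE.matcher(html);
        return matcher.find() ? matcher.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Stand-in for the content string returned by EntityUtils.toString(...)
        String content = "<html><head><title> Demo Page </title></head><body>hi</body></html>";
        System.out.println("title  = " + extractTitle(content));
        System.out.println("length = " + content.length());
    }
}
```

    In the Test class, the same call would replace System.out.println(content) with a line that prints PageSummary.extractTitle(content).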

    Click Run. The project setup is complete!
