Crawling wallhaven Images into a Local Wallpaper Library

相貌堂堂

Project address. The article with the same title on Zhihu was also published by me, so feel free to follow me there.

First, take a look at the console output:
[Screenshot: console output]

Next, take a look at the local wallpaper library:
[Screenshot: local wallpaper folder]

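Now to the main topic. This small project uses Jsoup (see the POM for the exact version), along with the JDK's thread pools, blocking queues (the producer-consumer pattern), and NIO.2 (the file watch service API), so it requires JDK 7 or later.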

The project consists of five classes plus an entry-point class.

Producer class (task: collect detail-page links from the list pages and put them into the blocking queue)

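import java.io.IOException;
import java.util.concurrent.BlockingQueue;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Producer implements Runnable {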

    private String name;
    private BlockingQueue<String> blockingQueue;

    public Producer(String name, BlockingQueue<String> blockingQueue) {
        this.name = name;
        this.blockingQueue = blockingQueue;
    }

    @Override
    public void run() {
        Document doc = null;
        try {
            for(int i = 1; i < 12018; i ++) {
                System.out.println();
                System.out.println();
                System.out.println("current page:" + i);
                System.out.println("-----------------------------------");
                if(i == 1) {
                    doc = Jsoup.connect("https://alpha.wallhaven.cc/latest").get();
                } else {
                    doc = Jsoup.connect("https://alpha.wallhaven.cc/latest?page=" + i).get();
                }
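                Element div = doc.getElementById("thumbs");
                if (div == null) {
                    // No thumbnail container on this page (layout change or error page); skip it.
                    continue;
                }
                Elements sections = div.getElementsByTag("section");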
                for (Element ele : sections) {
                    Elements links = ele.getElementsByClass("preview");
                    for (Element e : links) {
                        String href = e.attr("href");
                        blockingQueue.put(href);
                        System.out.println(name + " put " + href);
                    }
                }
            }
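            // An empty string is the end-of-stream marker that tells the consumer to stop.
            blockingQueue.put("");
            System.out.println(name + " is over");
        } catch (IOException e) {
            // Note: if a request fails, the marker above is never sent and the consumer keeps waiting.
            e.printStackTrace();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the executor can observe the interruption.
            Thread.currentThread().interrupt();
        }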
    }
}

Consumer class (task: take a link from the queue, resolve the image source URL, and hand the download task to a cached thread pool)

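import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Consumer implements Runnable {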

    private String name;
    private BlockingQueue<String> blockingQueue;
    private ExecutorService taskPool;

    public Consumer(String name, BlockingQueue<String> blockingQueue, ExecutorService taskPool) {
        this.name = name;
        this.blockingQueue = blockingQueue;
        this.taskPool = taskPool;
    }

    @Override
    public void run() {
        Document doc = null;
        try {
            String href = null;
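            // Compare string content, not references: != only works here by accident of literal interning.
            while (!(href = blockingQueue.take()).isEmpty()) {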
                System.out.println(name + " take " + href);
                doc = Jsoup.connect(href).get();
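                Element img = doc.getElementById("wallpaper");
                if (img == null) {
                    // Detail page without the expected <img id="wallpaper">; skip this link.
                    continue;
                }
                String src = "https:" + img.attr("src");
                taskPool.submit(new DownloadTask(src));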
            }
            System.out.println(name + " is over");
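        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the executor can observe the interruption.
            Thread.currentThread().interrupt();
        }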
    }

}
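
The handoff between the two classes above can be tried out without any network access. The following is a minimal sketch, not the project's code: `HandoffDemo`, `runHandoff`, and `END` are hypothetical names introduced for illustration. It pushes a few strings through a fair `SynchronousQueue`, uses the empty string as the end marker just like `Producer` does, and compares the marker by content rather than by reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical demo class; not part of the original project.
class HandoffDemo {

    // End-of-stream marker, mirroring the empty string used by Producer and Consumer.
    static final String END = "";

    // Pushes every link through a SynchronousQueue and returns what the consumer received.
    static List<String> runHandoff(List<String> links) {
        BlockingQueue<String> queue = new SynchronousQueue<>(true);
        List<String> received = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                String link;
                // Compare by content, not by reference.
                while (!(link = queue.take()).equals(END)) {
                    received.add(link);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String link : links) {
                queue.put(link); // blocks until the consumer takes the element
            }
            queue.put(END);
            consumer.join(); // after join(), reading `received` from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during handoff", e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(runHandoff(Arrays.asList("page-1", "page-2", "page-3")));
    }
}
```

Because a `SynchronousQueue` has no capacity, each `put` waits for a matching `take`, so the producer can never run ahead of the consumer. A bounded queue would be an alternative if some buffering between the two is wanted.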

Download task class (task: download the image to local disk)

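import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public class DownloadTask implements Runnable {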

    private static String path = "C:\\Users\\baiyapeng\\Desktop\\Paper\\";
    private String src;
    private String name;

    public DownloadTask(String src) {
        this.src = src;
        int n = src.lastIndexOf("/");
        this.name = src.substring(++n);
    }

    @Override
    public void run() {
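        File file = new File(path + name);
        if (file.exists()) {
            // Already downloaded: skip the network request.
            return;
        }
        try {
            // maxBodySize(0) lifts Jsoup's default body size cap, which would otherwise truncate large images.
            Response res = Jsoup.connect(src).ignoreContentType(true).maxBodySize(0).timeout(30000).execute();
            byte[] bytes = res.bodyAsBytes();
            // try-with-resources closes the file even if the write fails.
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.write(bytes);
            }
        } catch (IOException e1) {
            e1.printStackTrace();
        }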
    }

}
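
The file-naming and "skip if already present" steps of the download task can be sketched with the JDK alone. This is an illustration under assumptions, not the project's code: `SaveDemo`, `fileNameOf`, and `saveIfAbsent` are hypothetical names, the URL is a placeholder, and the sketch swaps in `Files.write` with `CREATE_NEW` in place of the `exists()` check plus `RandomAccessFile` used above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; the names here are illustrative, not from the original project.
class SaveDemo {

    // Same rule as DownloadTask: everything after the last '/' is the file name.
    static String fileNameOf(String src) {
        return src.substring(src.lastIndexOf('/') + 1);
    }

    // Writes bytes to dir/name only if the file does not exist yet.
    // Returns true if a new file was written, false if it was already there.
    static boolean saveIfAbsent(Path dir, String name, byte[] bytes) {
        try {
            // CREATE_NEW makes the existence check and the creation a single step.
            Files.write(dir.resolve(name), bytes,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("paper");
        // Placeholder URL, used only to show the name extraction.
        String name = fileNameOf("https://example.invalid/full/wallhaven-12345.jpg");
        System.out.println(name + " saved: " + saveIfAbsent(dir, name, new byte[] {1, 2, 3}));
        System.out.println(name + " saved again: " + saveIfAbsent(dir, name, new byte[] {1, 2, 3}));
    }
}
```

With many download tasks running in a cached thread pool, folding the check into the write avoids the small window in which two tasks both see the file as missing.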

Watch service class (task: register the directory path with the watch service and start listening)

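import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ResourceListener {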

    // Despite the field name, this is a cached thread pool, not a fixed-size one.
    private static ExecutorService fixedThreadPool = Executors.newCachedThreadPool();

    private WatchService ws;

    private ResourceListener(String path) {
        try {
            ws = FileSystems.getDefault().newWatchService();
            start();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void start() {
        fixedThreadPool.execute(new Listener(ws));
    }

    public static void addListener(String path) {
        try {
            ResourceListener resourceListener = new ResourceListener(path);
            Path p = Paths.get(path);
            p.register(resourceListener.ws, StandardWatchEventKinds.ENTRY_CREATE);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

Listener callback class (task: run the callback when an event arrives)

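import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.List;

public class Listener implements Runnable {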

    private WatchService service;

    public Listener(WatchService service) {
        this.service = service;
    }

    @Override
    public void run() {
        try {
            while (true) {
                WatchKey watchKey = service.take();
                List<WatchEvent<?>> watchEvents = watchKey.pollEvents();
                for (WatchEvent<?> event : watchEvents) {
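                    System.err.println(event.context() + " downloaded");
                }
                // reset() returns false once the key is no longer valid (e.g. the directory was deleted).
                if (!watchKey.reset()) {
                    break;
                }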
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and let the listener thread exit.
            Thread.currentThread().interrupt();
        }
    }
}
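
To see the watch service in isolation, here is a small sketch that watches a temporary directory, creates one file in it, and collects the reported names. It is an illustration only: `WatchDemo` and `createAndObserve` are hypothetical names, and it uses `poll` with a timeout instead of the blocking `take()` in `Listener` so that it always terminates. How quickly the event arrives depends on the platform's watch implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the original project.
class WatchDemo {

    // Registers dir for ENTRY_CREATE, creates one file in it, and returns the
    // names reported by the first batch of events (empty if the timeout elapses).
    static List<String> createAndObserve(Path dir, String fileName, long timeoutSeconds) {
        List<String> seen = new ArrayList<>();
        try (WatchService ws = dir.getFileSystem().newWatchService()) {
            dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve(fileName));

            // poll(...) with a timeout instead of take(), so the demo cannot hang forever.
            WatchKey key = ws.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (key != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; there is no file name to report
                    }
                    // For ENTRY_CREATE, context() is the path relative to the watched directory.
                    seen.add(event.context().toString());
                }
                key.reset();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("paper");
        System.out.println(createAndObserve(dir, "wallhaven-1.jpg", 30));
    }
}
```

Note that the directory is registered before the file is created. Files that already exist at registration time do not produce `ENTRY_CREATE` events.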

Entry-point class

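import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;

public class DownloadTaskExecutor {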

    public static void main(String[] args) throws IOException {
        
        ResourceListener.addListener("C:\\Users\\baiyapeng\\Desktop\\Paper\\");
    
        BlockingQueue<String> blockingQueue = new SynchronousQueue<String>(true);
        ExecutorService proservice = Executors.newSingleThreadExecutor();
        ExecutorService conservice = Executors.newSingleThreadExecutor();
        ExecutorService taskPool = Executors.newCachedThreadPool();
        proservice.submit(new Producer("Producer", blockingQueue));
        conservice.submit(new Consumer("Consumer", blockingQueue, taskPool));
        proservice.shutdown();
        conservice.shutdown();
    }

}
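
In the entry class, `proservice` and `conservice` are shut down, but `taskPool` is not. Idle threads of a cached thread pool expire on their own after 60 seconds, so this is not fatal, but an explicit shutdown makes the end of the run deterministic. The following is a sketch of one common pattern, not the project's code: `ShutdownDemo` and `runAndShutdown` are hypothetical names, and the 10-second wait is an arbitrary assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the original project.
class ShutdownDemo {

    // Submits `tasks` trivial jobs, shuts the pool down, and waits for them.
    // Returns the number of jobs that ran to completion.
    static int runAndShutdown(int tasks) {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                done.incrementAndGet(); // stand-in for a DownloadTask
            });
        }

        pool.shutdown(); // stop accepting new tasks; already submitted ones still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // give up and interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runAndShutdown(5));
    }
}
```

In this project the natural place for such a shutdown would be after the consumer has seen the end marker, since no further download tasks are submitted from that point on. The listener thread runs an endless loop, so it would still need to be stopped separately for the JVM to exit.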

Finally, point the system wallpaper settings at this folder and set how often the wallpaper changes.
[Screenshot: wallpaper settings]

Thanks, everyone. If you have any questions, leave a comment below.
