Help in the use of puppeteer-cluster #277

Closed
ricoxor opened this issue Apr 10, 2020 · 1 comment
Labels
question Further information is requested

Comments

ricoxor commented Apr 10, 2020

Hello,

I'm practicing with headless browsers and plan to build a small viewer bot. The goal is to point it at a site where a stream is broadcast, choose how many viewers to send to the stream, and be able to increase or reduce that number without relaunching the app.

Currently I have some problems with the use of puppeteer-cluster.

1/ I can't find a way to control the number of tasks active at the same time, that is, to add or remove tasks (viewers, in this case) at any time. Would plain Puppeteer be a better fit than puppeteer-cluster for my use case?

2/ While the cluster's tasks are running, a timeout in a single task makes all the other tasks crash as well. How can I fix that?

3/ Once a task is launched, how can I make sure it never ends, so that the viewer stays on the page without being detected as AFK and without the task being treated as finished?

const {Cluster} = require('puppeteer-cluster');
const vanillaPuppeteer = require('puppeteer')

const {addExtra} = require('puppeteer-extra')
const Stealth = require('puppeteer-extra-plugin-stealth')

async function main() {

  const puppeteer = addExtra(vanillaPuppeteer)
  puppeteer.use(Stealth())

  let viewers = 3;
  let live = 'https://a-live-stream.com';

  const browserArgs = [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-infobars'
  ];

  const proxies = [
    'proxy:port',
    'proxy:port',
    'proxy:port',
  ];

  let perBrowserOptions = [];

  for (let i = 0; i < viewers; i++) {
    perBrowserOptions = [...perBrowserOptions, {args: browserArgs.concat(['--proxy-server=' + proxies[i]])}]
  }

  const cluster = await Cluster.launch({
    puppeteerOptions: {
      headless: false,
      args: browserArgs,
      executablePath: 'C:/Program Files (x86)/Google/Chrome/Application/chrome.exe'
    },
    monitor: false,
    puppeteer,
    concurrency: Cluster.CONCURRENCY_BROWSER,
    maxConcurrency: viewers,
    perBrowserOptions: perBrowserOptions
  });

  cluster.on('taskerror', (err, data) => {
    console.log(`Error crawling ${data}: ${err.message}`);
  });

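  // Task function: open the stream page, click the player iframe, then stay on the page
  const viewer = async ({page, data: url}) => {
    await page.goto(url, {waitUntil: 'networkidle2'})
    const element = await page.$('iframe')
    // page.$ resolves to null when nothing matches, so guard before clicking
    if (element) await element.click()

    console.log('#Viewer live')
    // Wait 3,000,000 ms (50 minutes); this exceeds the cluster's default task
    // timeout of 30 seconds unless the `timeout` option is raised
    await page.waitFor(3000000)
    console.log('#Closed')
  };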

  // Queue one task per viewer instead of hard-coding three calls
  for (let i = 0; i < viewers; i++) {
    cluster.queue(live, viewer)
  }

  await cluster.idle()
  await cluster.close()
}

main().catch(console.warn)

thomasdondorf added the question (Further information is requested) label Apr 19, 2020

thomasdondorf (Owner) commented

  1. It sounds like you are looking for a statistics API. This is currently not implemented. See "Roadmap for v1.0" #8.

  2. What do you mean by "crash"? Does the whole cluster crash? That should not happen; the library will always try to restart the browser. Regarding your code: please be more specific and provide a minimal example that reproduces the problem.

  3. See "keep pages open and do tasks by interval" #147 (comment).
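
On point 2, and independent of how puppeteer-cluster handles errors internally, a general way to keep one slow or failing task from affecting the others is to give each task its own timeout and to collect the outcome per task. The sketch below is plain Node.js and does not use puppeteer-cluster; `withTimeout`, `runIsolated`, and `sleep` are hypothetical helper names introduced only for illustration.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared in either case.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: start every task and record success or failure per task,
// so that one rejection never rejects the whole batch.
async function runIsolated(tasks, ms) {
  const settled = await Promise.allSettled(
    // Promise.resolve().then(task) also turns synchronous throws into rejections.
    tasks.map((task) => withTimeout(Promise.resolve().then(task), ms))
  );
  return settled.map((result) =>
    result.status === 'fulfilled'
      ? { ok: true, value: result.value }
      : { ok: false, error: result.reason.message }
  );
}

// Stand-in for real work: resolves with `value` after `ms` milliseconds.
const sleep = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

// One fast task and one task that exceeds the 100 ms limit.
runIsolated([() => sleep(10, 'fast'), () => sleep(500, 'slow')], 100)
  .then((results) => console.log(results));
```

The fast task still reports its value even though the slow one times out. Note that the timeout only stops waiting for the slow task; it does not cancel the underlying work, which would need separate cleanup.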

Feel free to reopen with improved code or if anything is unclear.
