ABOUT


Here for Google Summer of Code?*


This is going to be an amazing year - lots of new things to work on, including JokerTV, a totally open TV receiver, plus several experimental/for-fun projects. Projects in C, Node, Python… you name it, we have it. There are also resources for students: we'll give you access to a high-speed server and to all our samples (we'll even ship a portable drive with them anywhere in the world, so don't worry about slow connections).

You are welcome to check out our ideas page (this is it - actual ideas at the bottom of the page), to start early on community bonding, and to learn a bit about our code. And of course, we'd love you to stay around even if we are not invited to GSoC or if we cannot invite you as a student.

*Current Page Updated for GSoC 2018




Go visit our organization's GSoC page!

Visit us!
  •  About Us


    What is CCExtractor Development All About?

      We are a small org, which means that your contribution will have a large impact. It's not going to mean a 0.5% improvement on a big project - it's going to be more than 10% on a medium-sized one. If you like challenges and want a chance to shine, this is your place.


      We have what we think is statistically amazing continuity in the team: most GSoC students from all the past years are still involved, even if they are no longer eligible as students. They still contribute code, and they mentor both in GSoC and in the sister program, GCI. As mentors, they also come to the Summer of Code summit, which traditionally takes place in October.

      We have mentors all over the world (North America, Europe, Asia and Australia), so time zones are never a problem. Our main channel of communication is a Slack channel to which everyone is welcome. We expect all accepted students to be available on Slack very often, even if you don't need to talk to your mentor. This will help you ask questions when necessary, and you might be able to help others out as well while working on your project.

      A mailing list is also available for those who prefer email over Slack. It's a new mailing list (the old one hasn't been used in a long time), but it's read by everyone involved in GSoC.

      All our top committers will be mentoring. Many of them are former GSoC students.

  •  About What We Use


    Languages Used!

      • Core tool that gives the organization its name (CCExtractor): C (not C++)

      • Current Windows GUI: C#. We also have another GUI for Linux that's written with Qt, and a small GUI that's integrated into the main program (C).

      • Regression testing tool: Python, with JS, CSS, and some shell scripting. The test suite is written in C#.

      • Prototype real-time demo site: NodeJS


      We also have a number of support tools that do various things, from downloading subtitles from streaming services to translating them with Google Translate or DeepL. Most of them are written in Python, but since they are small tools that do their job, you don't need to worry much about them. For totally new things you can use whatever tool you feel is best for the job.

  •  About Sample Media and Other Resources


    Resources, resources, resources!

      We work with huge files. Not all of them are huge, but many are. We know that many students don't have access to high-speed internet. To those students we will ship (as soon as they are selected) a portable hard drive with all our samples. So if your internet connection is not good, don't worry - as long as you can plug a USB drive into your development computer, you can participate with us.

      We also have a shared Linux development server with lots of storage and a Gigabit uplink. Students get an account on it and they are welcome to use it. There's nothing there except our own work, so it's a trusted environment (for a server that is connected to the internet, of course).

      The sample platform also hosts a number of samples, both small and decently sized.

  •  About The Project and Getting Accepted


    Getting Accepted!

      Qualification: On top of the quality of the proposal, of course, we will be ranking students with a points system (we introduced this last year, and it worked pretty well).

      We don't have a minimum number of required points, but you definitely will need some (between equally good proposals we will rank based on acquired points). This means that the more points you get, the more likely you are to be invited to join us during the summer, assuming that your proposal is good.

      You can get points by doing one (or more) of the following:

      • By solving issues in our GitHub issue trackers (CCExtractor and the sample platform). The default is 1 point per issue unless specified somewhere in the issue page. Most issues have an explicit number of points that you can find in a comment.
      • By joining the community in Slack. You can invite yourself here. (1 point)
      • If you are a former Code-in finalist you start with 1 point. If you were a winner, you start with 2 points. Note that only a few developers qualify, so don't be discouraged if you aren't one of them. Almost no one is, but we'd love to hear from those who are.
      • By sending us a TV sample that has something we don't support. It doesn't have to be from your own country (since, hopefully, we already support it), but if it is, so much the better. This is probably hard to get, since we already got all the low-hanging fruit. But if your local TV has subtitles you can turn on and off, we'd love a recording.
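
      As a rough illustration of the rules above, the following minimal sketch tallies points and orders applicants by proposal quality first, using points only to break ties. It is not how the mentors actually score anyone: the `Applicant` class, its field names, and the numeric `proposal_score` are hypothetical and exist only for this example. The TV sample option is left out because no point value is stated for it.

```python
from dataclasses import dataclass, field
from typing import List

DEFAULT_ISSUE_POINTS = 1   # default per solved issue unless the issue states otherwise
SLACK_POINTS = 1           # joining the community on Slack
CODE_IN_POINTS = {"none": 0, "finalist": 1, "winner": 2}


@dataclass
class Applicant:
    # Hypothetical record; every field name here is an assumption for this sketch.
    name: str
    proposal_score: float                 # made-up numeric stand-in for proposal quality
    issue_points: List[int] = field(default_factory=list)  # points of each solved issue
    joined_slack: bool = False
    code_in_status: str = "none"          # "none", "finalist", or "winner"

    def points(self) -> int:
        total = sum(self.issue_points)
        if self.joined_slack:
            total += SLACK_POINTS
        total += CODE_IN_POINTS[self.code_in_status]
        return total


def rank(applicants: List[Applicant]) -> List[Applicant]:
    # Proposal quality comes first; points only separate equally good proposals.
    return sorted(applicants, key=lambda a: (a.proposal_score, a.points()), reverse=True)


if __name__ == "__main__":
    alice = Applicant("alice", 8.0, issue_points=[DEFAULT_ISSUE_POINTS, 3], joined_slack=True)
    bob = Applicant("bob", 8.0, joined_slack=True, code_in_status="finalist")
    carol = Applicant("carol", 9.0)
    for a in rank([alice, bob, carol]):
        print(a.name, a.proposal_score, a.points())
```

      In this made-up example, alice and bob have equally good proposals, so alice's extra issue points place her ahead of bob, while carol's stronger proposal ranks first regardless of points.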
  •  Best Qualification Tasks


    Qualification Task | Status
     Terrible OCR results with Channel 5 (UK) | OPEN
     Can't extract multi-track subtitles from .mp4 | SOLVED
     CCExtractor won't extract subtitles from TS with no PAT/PMT | OPEN
     Automatically switch to correct encoding for 708 subtitles based on PMT data | OPEN
     French captions lack accents | SOLVED
     CCExtractor is unable to recover position in file after finding a bad NAL due to input corruption | OPEN
     Request: Allow to extract several teletext pages in one pass | OPEN
     'live' raw data problem | OPEN
     DVB Teletext subtitle incomplete | SOLVED
     Feature request: Write processed bytes to file | OPEN
     Extract telemetry data (which is stored in a subtitle track) from a Drone recording | OPEN

    The sample platform's issues are tagged with “gsoc-proposal-task”, so you can easily see what you can work on.